Structured data acquisition, cleaning, normalization, and matching

The Internet (remember, the Internet came before AI) contains a massive amount of structured and unstructured data. Google indexes the unstructured data and makes it searchable. Problem solved. However, structured data on web pages is generated from offline databases and sometimes, in the case of stores, from corporate databases. Structured data which was nicely […]

Developing supervised learning deep neural network (NN) classifiers

I see several big problems with developing supervised deep learning neural network (NN) classifiers. The accuracy of the classification depends on the quality/quantity of the labeled data and the topology of the network. Pretrained weights can significantly reduce the amount of time required to train the network. However, problems can crop up with matching public […]

Why companies need matched product records, bad data detection, and data fixes/suggestions

The basis for all transactions involving products on the Internet is Quality Product Data. Knowing what the product record is on each store’s web page and matching the same products on different sites are part of the Quality Product Data equation. Quality Product Data results in higher conversions and hence higher revenues. Let’s look at which […]

Creating Quality Product Data

Product data is available in data feeds and on web sites. In order to create Quality Data, bad data must be detected in both data feeds and web sites. The bad/missing product data must be recognized in order to make good matched records. I found searching for products on the web infuriating because: Sites failed […]
