analytics E-Commerce Technology · 2023

GoGlocal: Pricing & Product Intelligence Across 1,000+ SKUs on Amazon, eBay, Walmart & Lazada

As Manager of Data Science at GoGlocal, I built the cross-border e-commerce data product the merchandising team ran on: NLP product classification and attribute extraction, marketplace-aware pricing intelligence, and competitor analysis across Amazon, eBay, Walmart, and Lazada - cutting manual listing effort by 50% and improving revenue-estimation efficiency by 30%.

Problem

GoGlocal's merchandising team was manually classifying, pricing, and listing 1,000+ SKUs across four international marketplaces - each with its own taxonomy, fee structure, and pricing dynamics - with no systematic source for pricing intelligence, product classification, or competitor positioning.

Outcome

An NLP classification, pricing-intelligence, and competitor-analysis product that automated the most labor-intensive listing and pricing work across 1,000+ SKUs on 4 marketplaces - 50% less manual effort and 30% better revenue-estimation efficiency.

Impact - who used it & what changed

GoGlocal's merchandising team used the pricing and classification outputs to set prices and list products across Amazon, eBay, Walmart, and Lazada - replacing per-SKU manual research and hand judgment with marketplace-aware recommended price points and auto-classified, attribute-tagged listings.

Overview

At GoGlocal I owned the full data product - not a component of someone else’s system, the whole thing - and the people who used its output every day were the merchandising team.

The problem was cross-border e-commerce at scale: 1,000+ SKUs selling across Amazon, eBay, Walmart, and Lazada, each with different listing requirements, pricing dynamics, and buyer behavior. The merchandising team was handling all of it by hand. Every hour spent researching a competitor’s price or classifying a product was an hour not spent on strategy. Margins in cross-border e-commerce are thin; operational inefficiency is a competitive disadvantage that compounds.

As Manager of Data Science, I directed the strategy and built the core machine learning systems that gave the team the two things they had spent the most time producing manually: where a product belongs and what it should sell for. The result was a 50% reduction in manual effort and a 30% improvement in revenue-estimation efficiency.

The Problem

Cross-border e-commerce at the SKU level is repetitive and data-intensive. Product classification, keyword optimization, pricing research, competitor checks, listing creation - done by hand, across four marketplaces, for over a thousand products. At that volume it was operationally untenable, and it scaled linearly with headcount instead of with the catalog.

The compounding challenge: different marketplaces have different taxonomies, listing formats, and pricing signals. A price or classification that worked on Amazon did not transfer to Lazada. Every output the team consumed had to be marketplace-aware, or it was wrong on three marketplaces out of four.

Data & Inputs

Data collection was itself a significant engineering challenge. Scraping at scale across regions with anti-bot measures required a robust, distributed pipeline, and that infrastructure had to exist before any machine learning, or any usable output, could reach the team.

Approach

Three layers. Each one feeds the next, and the team consumed the output of all three.

Layer 1: SKU Intelligence (classification & attribute extraction)

NLP-driven product classification using semantic similarity and attribute extraction. Given a product with its description and attributes, the system automatically placed it in the correct marketplace taxonomy, extracted the key searchable attributes, and generated optimized listing copy using fine-tuned OpenAI models. This is what replaced the team manually deciding which category a SKU belonged to on each marketplace.
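To make the semantic-similarity step concrete, here is a minimal sketch, not the production code. It assumes a tiny made-up taxonomy and uses plain token counts as a stand-in for sentence embeddings so that it runs without any model download; in a real pipeline the `embed` function would be replaced by an embedding model. All names (`TAXONOMY`, `embed`, `classify`) are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical category labels and descriptions; real marketplace taxonomies are far larger.
TAXONOMY = {
    "Electronics > Headphones": "wireless bluetooth headphones earbuds audio headset",
    "Home & Kitchen > Cookware": "frying pan pot cookware nonstick kitchen skillet",
    "Sports > Yoga": "yoga mat exercise fitness non-slip pilates",
}


def embed(text: str) -> Counter:
    """Stand-in embedding: lowercase token counts (assumption, not a learned model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def classify(description: str, taxonomy: dict) -> tuple:
    """Return the taxonomy node most similar to the description, with its score."""
    query = embed(description)
    scores = {cat: cosine(query, embed(text)) for cat, text in taxonomy.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


category, score = classify("Nonstick frying pan, 28 cm skillet for kitchen", TAXONOMY)
print(category, round(score, 3))
```

Running one `classify` call per marketplace taxonomy is what makes the output marketplace-aware: the same description can land on a different node in each taxonomy.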

Image optimization using Stable Diffusion for background replacement and product visualization - a real commercial requirement for marketplace listing quality that most teams handle by hand.

Layer 2: Pricing Intelligence & Competitor Analysis

Scraped competitor pricing was normalized and used to train price-response models at the category level. The system recommended an optimal price point per SKU per marketplace, accounting for marketplace fees, currency, and competitive position. Combined with sales forecasting and inventory levels, it produced an expected profit margin per SKU per marketplace, so the team could prioritize the most valuable listing opportunities rather than guess. This pricing output is what the merchandising team used to set prices across marketplaces.
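The sketch below illustrates how a fee- and currency-aware price recommendation can be computed. It is an illustration under stated assumptions, not the model used at GoGlocal: the constant-elasticity demand curve, the fee rate, the exchange rate, and every function name are hypothetical stand-ins for the category-level price-response models described above.

```python
from dataclasses import dataclass


@dataclass
class Marketplace:
    name: str
    fee_rate: float    # commission as a fraction of list price (illustrative value)
    fx_to_home: float  # listing currency -> home currency rate (illustrative value)


def expected_units(price, ref_price, ref_units, elasticity):
    """Assumed constant-elasticity price response around a reference price."""
    return ref_units * (price / ref_price) ** (-elasticity)


def unit_margin(price, mp, unit_cost_home):
    """Per-unit profit in home currency after marketplace fee and conversion."""
    return price * (1.0 - mp.fee_rate) * mp.fx_to_home - unit_cost_home


def recommend_price(mp, unit_cost_home, ref_price, ref_units, elasticity, candidates):
    """Pick the candidate price with the highest expected profit on this marketplace."""
    def profit(p):
        units = expected_units(p, ref_price, ref_units, elasticity)
        return unit_margin(p, mp, unit_cost_home) * units

    best = max(candidates, key=profit)
    return best, profit(best)


# Illustrative numbers only; the fee rate is not an actual marketplace fee.
amazon = Marketplace("Amazon", fee_rate=0.15, fx_to_home=1.0)
candidates = [float(p) for p in range(15, 36)]
best_price, best_profit = recommend_price(
    amazon, unit_cost_home=10.0, ref_price=20.0, ref_units=100.0,
    elasticity=2.0, candidates=candidates,
)
print(best_price, round(best_profit, 2))
```

Because the fee and exchange rate enter the margin directly, the same SKU gets a different recommended price on each marketplace even when the demand curve is identical.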

Layer 3: Demand Forecasting & Inventory Planning

SKU-level demand forecasting using XGBoost and LightGBM with marketplace-specific features. Trend tracking to catch velocity changes early - before stockouts or overstock developed - and replenishment recommendations based on forecast plus lead time plus safety stock.
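One common way to combine forecast, lead time, and safety stock into a replenishment recommendation is the textbook reorder-point calculation sketched below. The safety-stock formula (z times daily demand standard deviation times the square root of lead time) and the default z of 1.65 are assumptions for illustration; the source does not specify the exact formula used, and all function names are hypothetical.

```python
import math


def safety_stock(daily_demand_std, lead_time_days, z=1.65):
    """Buffer against demand variability over the lead time (assumed textbook form)."""
    return z * daily_demand_std * math.sqrt(lead_time_days)


def reorder_point(forecast_daily_demand, lead_time_days, daily_demand_std, z=1.65):
    """Expected demand during the lead time plus the safety buffer."""
    lead_time_demand = forecast_daily_demand * lead_time_days
    return lead_time_demand + safety_stock(daily_demand_std, lead_time_days, z)


def replenishment_qty(on_hand, on_order, forecast_daily_demand,
                      lead_time_days, daily_demand_std, z=1.65):
    """Units to order so that stock on hand plus on order covers the reorder point."""
    rop = reorder_point(forecast_daily_demand, lead_time_days, daily_demand_std, z)
    shortfall = rop - (on_hand + on_order)
    return max(0, math.ceil(shortfall))


# Illustrative inputs: forecast of 10 units/day, 9-day lead time, std of 4 units/day.
qty = replenishment_qty(on_hand=60, on_order=20, forecast_daily_demand=10,
                        lead_time_days=9, daily_demand_std=4)
print(qty)
```

In this sketch the forecast would come from the demand models, while the demand standard deviation could be estimated from forecast residuals.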

Engineering & Implementation

Multi-tenant architecture to support different merchant accounts with isolated data and compute.

What Changed for the Team

The merchandising team stopped doing per-SKU pricing research and manual category mapping and started working from the system’s output: marketplace-aware recommended price points and auto-classified, attribute-tagged listings.

The net effect, measured across 1,000+ SKUs on 4 marketplaces: 50% reduction in manual effort and 30% improvement in revenue-estimation efficiency - faster, more accurate per-SKU forecasts.

Limitations & What I’d Do Differently

The marketplace taxonomy mapping was hand-crafted for the initial set of categories. A proper learned taxonomy alignment would scale better as new product categories are added - without requiring manual extension every time the catalog expands.

The demand forecasting models were trained per-marketplace in isolation. A hierarchical model that shares information across marketplaces would improve accuracy for low-velocity SKUs where per-marketplace data is sparse.
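The idea of sharing information across marketplaces can be illustrated with a simple partial-pooling estimate. This is a deliberately simplified sketch, not a full hierarchical model and not something that was built: each marketplace's mean demand for a SKU is shrunk toward the SKU's cross-marketplace mean, with more shrinkage where data is sparse. The function name, the `prior_strength` parameter, and the sample numbers are all hypothetical.

```python
def pooled_demand(per_marketplace, prior_strength=10.0):
    """Shrink each marketplace's mean demand toward the cross-marketplace mean.

    per_marketplace: dict mapping marketplace name -> list of weekly unit sales.
    prior_strength: pseudo-observation weight given to the shared mean (assumption).
    """
    all_obs = [x for obs in per_marketplace.values() for x in obs]
    if not all_obs:
        return {}
    global_mean = sum(all_obs) / len(all_obs)

    estimates = {}
    for name, obs in per_marketplace.items():
        n = len(obs)
        local_mean = sum(obs) / n if n else global_mean
        # Weighted average: many observations -> trust local data; few -> lean on shared mean.
        estimates[name] = (n * local_mean + prior_strength * global_mean) / (n + prior_strength)
    return estimates


# Illustrative data: one well-observed marketplace and one with a single observation.
sales = {"amazon": [10] * 20, "lazada": [2]}
est = pooled_demand(sales)
print({k: round(v, 2) for k, v in est.items()})
```

The well-observed marketplace barely moves from its own mean, while the sparse one is pulled strongly toward the shared estimate; `prior_strength` controls how aggressive that pull is.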

Stack

Python, Ray, AWS (EC2, S3, Lambda), OpenAI API (fine-tuned), Stable Diffusion, sentence-transformers, scikit-learn, XGBoost, LightGBM, FastAPI, PostgreSQL, Redis
