Any product team can build real-time personalization overnight with Miso

Taha Kagadawala

November 17th, 2022

When I joined the Miso.ai team last January as a DevEx Engineer, I was excited about working in the personalization space. I had a background in API-first design, but hadn’t had the chance to apply it in personalization. Thanks to not living under a rock for the last 10 years, I was familiar with the concept of personalization (which isn’t saying much, since we’re living in an era where anyone can binge almost any TV show or movie, then play decades’ worth of music, and then buy forever online till their credit card company cuts them off…but I digress). Without personalization, the internet would go from a place where I can find anything to a labyrinth that I’d be trapped searching in forever.

So while I felt like I had a good grasp on personalization already, I admittedly didn’t know a whole lot about real-time personalization and just how transformative it was. Over the last 10 months, I’ve pored over books and articles, honed my skills under the tutelage of our CTO Andy (whose doctoral dissertation is the foundation of Miso), and most importantly, talked to tons of companies about how real-time personalization fits into their growth roadmap.

While there’s still so much more to learn and unpack, here’s what I’ve learned.

Personalization versus real-time personalization. What’s the difference?

I’ll start with a boilerplate definition of static, or non-real-time, personalization:

Personalization is showing your customers content that’s uniquely relevant to them, at a (ideally) 1:1 level and across marketing channels. This could be based on their historical behavioral patterns, such as browsing and purchase history, an onboarding survey, or any other explicit or implicit signal of their preferences.

Miso.ai, on the other hand, is a real-time 1:1 personalization engine. Simply put: a real-time personalization engine learns from your users’ clickstream data as it’s being generated — click throughs, searches, add to carts, likes, dislikes, scroll depth, pretty much anything that’s happening within a browsing session — and uses that data to create uniquely personalized search rankings and recommendations on a 1:1 basis. A great real-time personalization engine should work at cold-start and be able to start modeling the experience in as few clicks as possible (by the way, it only takes Miso two to three clicks to triangulate a user and then fine tune their search, recommendation and discovery feed touchpoints, and it’s all accomplished without using cookies, pixels, or device fingerprinting).

For all you nerds out there: 1:1 but static personalization is like the very first Mark I Iron Man suit. Powerful, cool, but not the pinnacle of what could be.

But real-time, continuous, dynamic, 1:1 personalization is like the final Iron Man Mark 85 suit (the one with the nano-bots, where Tony could make the suit into anything he needed).

How Miso’s real-time personalization works under the hood

Our approach to real-time personalization is to utilize your first-party data (product and clickstream data that many organizations already collect) along with transformer-based machine learning to create multivariate vector embeddings for users and products. Embeddings are created for every user in real time, based on behavioral patterns (or interactions) that tell the story of the user’s session. The clickstream data is cached, and any new event triggers a recalculation of the user embedding on the fly. This gives Miso’s personalization engines insight into things like consumer interests, brand affinity, style preference, and price sensitivity, within just a few clicks. For the product catalog, Miso’s engines create embeddings based on the product’s metadata, such as the title, description, brand, price, and overall popularity relative to the other products in the catalog. Because user and product embeddings live in the same vector space, a dot product between a user embedding and a product embedding scores how well that product matches that user. This enables Miso to cluster similar products, cluster similar users, and connect users to products that they are likely to be interested in.
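To make the scoring step concrete, here is a minimal sketch in Python. It is not Miso’s actual implementation: the three-dimensional vectors, the product IDs, and the rank_products helper are all made up for illustration, and real embeddings come from the trained model and have far more dimensions.

```python
from typing import Dict, List, Tuple

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two vectors that live in the same space."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    return sum(x * y for x, y in zip(a, b))


def rank_products(
    user: Vector, products: Dict[str, Vector], top_k: int = 3
) -> List[Tuple[str, float]]:
    """Score every product against the user embedding, highest score first."""
    scored = [(product_id, dot(user, vec)) for product_id, vec in products.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): tiny hand-written embeddings.
user_embedding = [0.9, 0.1, 0.4]
catalog = {
    "corgi-plushie": [0.8, 0.0, 0.5],
    "boba-sticker": [0.7, 0.2, 0.3],
    "power-drill": [0.0, 0.9, 0.1],
}

ranking = rank_products(user_embedding, catalog, top_k=2)
print([product_id for product_id, _ in ranking])
```

With these toy numbers, the two products whose vectors point in roughly the same direction as the user’s come out on top, and the unrelated one is dropped.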

P.S. if you really want to nerd out on this topic, here’s the original paper Andy published while at the small data lab, with his PhD and post-doc advisor, Deborah Estrin.

There’s so much that’s cool about this approach at an ML level, but the practical product UX aspect that I personally find really powerful is our solution to the cold-start problem (i.e. making recommendations for a visitor who doesn’t have any previous interactions on your site). Here’s a brief demo to show you what it looks like for a brand new visitor to a marketplace:

Just for context, in this video I’m creating a new user instance. Pay close attention to the left-hand panel where the current user’s interest graph tags are generated. Since this is a new user (and Miso doesn’t have any tags yet), my homepage recommendations are just based on what’s popular and trending for all users right now on the marketplace. However, as soon as I search for “plushies”, the user embedding starts being built out. Every time I click through to a product detail page, the tags are instantly updated, ready for my next interaction.

Finally, when I stop looking at plushies and pivot into a search for “stickers” (a different category altogether), you can see that the aesthetic and latent features of the types of plushies I was looking for are carried over to my stickers query: cute corgis, boba, and crocheting. And all of that happened in just a few clicks.

This combination of product, user, and query understanding can have a dramatic effect on conversions, GMV lift, and overall liquidity of your site, and it all happens in real time, at cold-start.
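The cold-start behavior from the demo can be sketched in a few lines of Python. This is an illustration under loud assumptions, not how Miso’s engine works internally: the recommend function, the two-dimensional vectors, and the product IDs are hypothetical, and averaging the viewed products’ embeddings is a crude stand-in for the transformer-based model.

```python
from typing import Dict, List

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise average; a simple stand-in for a learned session encoder."""
    count = len(vectors)
    return [sum(dim) / count for dim in zip(*vectors)]


def recommend(
    viewed_ids: List[str],
    catalog: Dict[str, Vector],
    trending: List[str],
    top_k: int = 2,
) -> List[str]:
    """Return trending items at cold start, personalized items afterwards."""
    viewed = [catalog[pid] for pid in viewed_ids if pid in catalog]
    if not viewed:
        # No interactions yet: fall back to what is popular for everyone.
        return trending[:top_k]
    user = mean_vector(viewed)
    # Rank the products the visitor has not seen by dot-product score.
    candidates = [pid for pid in catalog if pid not in viewed_ids]
    candidates.sort(
        key=lambda pid: sum(u * v for u, v in zip(user, catalog[pid])),
        reverse=True,
    )
    return candidates[:top_k]


# Toy data (hypothetical).
catalog = {
    "corgi-plushie": [0.9, 0.1],
    "boba-plushie": [0.8, 0.2],
    "corgi-sticker": [0.7, 0.3],
    "power-drill": [0.0, 1.0],
}
trending = ["power-drill", "corgi-plushie"]

first_visit = recommend([], catalog, trending)
after_click = recommend(["corgi-plushie"], catalog, trending)
print(first_visit, after_click)
```

The first call has nothing to go on and returns the trending list; after a single click, the ranking already shifts toward items that resemble what the visitor looked at.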

Implementing a real-time clickstream data pipeline with Miso. Many paths, one destination.

Whether you’re upgrading from a collaborative filtering recommender system or a monolithic search engine or diving directly into real-time personalization, you’ll need to figure out how to automatically feed your first-party data into your personalization engine. Typically, two types of data are required for your Miso engines: your clickstream logs and the latest snapshot of your product catalog. The clickstream data needs to be sent as soon as it’s generated, whereas the catalog data can be refreshed as often as your business requires (usually at least once daily).

Note that for the initial model training, you’ll likely just need to provide a historic snapshot of your clickstream and product catalog data. Over time, as more clickstream is loaded in real-time, the model will improve via reinforcement learning. We’ve found that ~3 months of archived clickstream data is enough to get a good first-pass initial model, which is in contrast to many ML methods for personalization and recommendation systems that can require years of data to work well. This historical data is useful for the initial model calibration and for returning users, but to unlock the benefits of real-time, continuous, dynamic, in-session personalization for brand new users, you’ll still need to set up a real-time clickstream data pipeline.

Depending on your tech stack and developer capacity, there are numerous options that we’ve seen work well. I’ve outlined some patterns below and linked to the corresponding guides that show you the approach in greater detail.

Pattern 1: Backend, using APIs

In my experience, this is the most common way for teams that already collect clickstream data to set up a real-time data pipeline to Miso. This involves using your backend server to access Miso’s Data API. An example would be triggering events using JavaScript tags from a service like Snowplow or MetaRouter for interactions like page views and “add to carts” and pushing to an event streaming service like Apache Kafka or Amazon Kinesis. In this type of system architecture, Miso would be an additional consumer and would receive the events via REST API.
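As a rough sketch of what such a consumer might do, the Python below maps internal events to an interaction payload and builds a POST request. Everything specific here is an assumption for illustration: the URL is a placeholder, and the field names, the X-API-Key header, and the shape of the raw events are hypothetical, so take the real endpoint and schema from the Data API documentation.

```python
import json
import urllib.request
from typing import Any, Dict, List

# Placeholder URL (assumption), not a real endpoint.
INTERACTIONS_URL = "https://api.example.com/v1/interactions"


def to_interaction(raw: Dict[str, Any]) -> Dict[str, Any]:
    """Map an internal clickstream event to a hypothetical interaction record."""
    return {
        "type": raw["event_name"],
        "user_id": raw["user"],
        "product_ids": [raw["product"]],
        "timestamp": raw["ts"],
    }


def build_request(
    raw_events: List[Dict[str, Any]], api_key: str
) -> urllib.request.Request:
    """Bundle a batch of events into one JSON POST request."""
    body = json.dumps({"data": [to_interaction(e) for e in raw_events]})
    return urllib.request.Request(
        INTERACTIONS_URL,
        data=body.encode("utf-8"),
        # The header name for the key is an assumption.
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


# One event as it might arrive from your event stream (hypothetical shape).
events = [
    {
        "event_name": "add_to_cart",
        "user": "u-123",
        "product": "corgi-plushie",
        "ts": "2022-11-17T12:00:00Z",
    }
]
request = build_request(events, api_key="YOUR_SECRET_KEY")
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
print(request.get_method(), request.full_url)
```

Keeping the mapping in its own function means the consumer can be tested without a network connection and adjusted in one place when your event schema changes.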

For more on Miso’s API-first approach, check out our related guide.

Pattern 2: Front-end, using SDK

If you don’t currently collect clickstream data, a good low-code option is to simply push clickstream events directly from the client’s browser. Miso has a client-side SDK for JavaScript that enables this functionality in a very straightforward way. Note that this would allow your clickstream data to stream in real time, but you’d still be regularly updating your product catalog by other means (as well as uploading your historical clickstream data to get started).

For reference, here are a couple guides to putting this approach to work with our SDK and a 3rd party data pipeline tool, Meltano:

Pattern 3: 3rd Party Customer Data Platform

If you already collect clickstream data for your site and put it in a cloud-based CDP like Segment, you may be able to use a webhook to push those events in real-time into your real-time personalization engine.

This type of pipeline is easy to implement and doesn’t lead to redundancy in data collection.
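For a sense of what the receiving end of such a webhook has to do, here is a small Python sketch that translates a Segment-style track event into an interaction record. The incoming field names (type, event, userId, anonymousId, properties, timestamp) and the event names follow Segment’s track call and e-commerce conventions; the output record and the EVENT_TYPE_MAP values are hypothetical and would need to match whatever schema your personalization engine expects.

```python
from typing import Any, Dict, Optional

# Segment e-commerce event names mapped to hypothetical interaction types.
EVENT_TYPE_MAP = {
    "Product Viewed": "product_detail_page_view",
    "Product Added": "add_to_cart",
    "Products Searched": "search",
}


def segment_to_interaction(payload: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Translate one webhook payload; return None for events we do not forward."""
    if payload.get("type") != "track":
        return None
    target_type = EVENT_TYPE_MAP.get(payload.get("event", ""))
    if target_type is None:
        return None
    props = payload.get("properties", {})
    record: Dict[str, Any] = {
        "type": target_type,
        "user_id": payload.get("userId"),
        "anonymous_id": payload.get("anonymousId"),
        "timestamp": payload.get("timestamp"),
    }
    if "product_id" in props:
        record["product_ids"] = [props["product_id"]]
    if "query" in props:
        record["search_query"] = props["query"]
    return record


# Example payload (made up) for an anonymous visitor viewing a product.
incoming = {
    "type": "track",
    "event": "Product Viewed",
    "anonymousId": "anon-42",
    "timestamp": "2022-11-17T12:00:00Z",
    "properties": {"product_id": "corgi-plushie"},
}
print(segment_to_interaction(incoming))
```

Events that are not in the map are dropped rather than guessed at, which keeps unrelated tracking calls out of the personalization data.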

Check out our guide to integrating with Segment.

Pattern 4: Commerce and Content Platforms

Popular platforms such as Shopify and WordPress maintain an ecosystem of 3rd party apps. Miso already has plugins available to automatically sync catalog and clickstream data. This method typically doesn’t require any developer resources, as it’s done using a point-and-click interface.

Here’s a look at our Shopify integration.

A Note about Product Catalog Data

Now that I’ve spent some time unpacking the importance of real-time clickstream data and various ways of implementing a pipeline for it, I also want to address the other critical part of the data requirement for real-time personalization: product catalog data.

Product catalog data refers to the set of metadata about the products you’re selling (or publishing for content media sites). Some examples include: the title, description, category, price, brand, images, author, year, and color. Upon model training, Miso creates vector embeddings of each product to understand it at a deep, semantic level and maps the embeddings alongside other products and other users (this drives our product cold-start optimization). In other words, you can’t have real-time personalization without real-time clickstream data and product catalog data.

Miso works well with catalogs of all sizes, ranging from fewer than 200 products to hundreds of thousands. Similar to the real-time clickstream data, you will need a historical data load for the initial engine training as well as a data pipeline in place for future updates. Unlike the clickstream data pipeline, however, the product catalog pipeline does not need to be real-time. This gives you more flexibility on the tech side of things, but it should still be able to deliver updates at a cadence appropriate for your business. For example, marketplaces can have thousands of new products added daily and may require model refreshes multiple times a day, or even every hour.

Questions product teams frequently wonder about before starting

Hopefully this is a nice recap of everything I’ve been lucky to learn and see firsthand the past year as a member of the Miso Shiba Squad. But since we get to talk to a lot of organizations about their personalization wishlist and concerns, here are some common questions that come up:

1) How are anonymous users handled?

Anonymous users make up the bulk of web traffic so we make sure to optimize Miso for cold-start. For brand new users with zero interactions, Miso makes predictions based on what’s popular and trending. As soon as clickstream events occur, however, Miso will craft a user-embedding in real-time and start personalizing their search results and recommendations.

2) How are new products recommended?

Related to the previous question, brand new products don’t have a lot of impressions, which poses a challenge for traditional collaborative filtering recommender systems. Miso’s personalization engine can intuit the segment of customers that might be interested in a new product, simply based on relation to similar product embeddings.

3) Does Miso have merchandising capabilities available for finer tuning of product rankings?

Your personalization engine should support your merchandising team, not attempt to replace it. To that end, Miso has functionality for filtering, boosting and pinning, and anchoring, to give you full control over the recommendation engine when you need it.

4) Will real-time personalization make my UX feel slow and sluggish?

Nope! A typical response time (generating a search result or a list of recommended products) is under 100ms. We’ve taken measures to keep latency as low as possible, like multi-region deployment, load balancing, and horizontal scaling with AWS EC2 instances for days when shopping demand is surging, like Black Friday and Super Bowl Sunday.

5) What does the pre-launch and post-launch support look like?

The very first thing we do is connect on Slack. That way, our engineering team is always available for questions and technical support. Also, we have a standing biweekly stand up with your organization over Zoom for as long as you need it. We can also help you run A/B tests on your site to verify the performance gains of implementing Miso.

6) Does Miso charge for data ingestion?

Nope! Since real-time systems get better with more data, the last thing we want is for you to feel disincentivized from sending as much data as possible to Miso.

7) How often do you retrain the model?

The engine training cadence can be configured, but it’s typically once every 4 hours.

8) What’s the general performance lift (and for cold-start in particular)?

We’ve been lucky to see conversion lift on every deployment since launching, which is why we guarantee conversion lift for every team we work with. We’ve also seen an AOV lift of 5–20% on a touchpoint-by-touchpoint basis.

9) What happens when a visitor’s cookies expire?

Depending on the browser and the actions a user takes with their cookies, cookies will indeed expire. If we lose the first-party cookie that tells us a returning visitor is someone we’ve seen before, we return to the cold-start state and rebuild the personalization user embedding for this visitor. But when the visitor makes a purchase or logs in, the insights from their visitor session will still be reconciled and saved into their registered user embedding.

10) How long will it take to implement real-time personalization?

In terms of development cycles, we anticipate that implementing Miso’s real-time personalization platform takes around one sprint. A full deployment in production could take a little longer, depending on the length of your A/B test.

Closing Deep Thoughts (aka time for a conclusion)

For far too long, real-time personalization has been an unfair advantage for tech giants who have access to petabytes of consumer data, unfathomable processing power, and teams of data scientists working to make sense of it all. At Miso.ai, we’re on a mission to remove the barrier to entry of implementing behavioral, intent-driven, real-time personalization for e-commerce and content media sites. Our personalization models are designed to empower product teams and can be deployed in your production environment in less than a week, with the data you already have. You can expect to see measurable improvement in metrics like GMV, AOV, and CTR immediately.

If this sounds interesting to you, drop us a line at hello@askmiso.com or fill out our contact form for a personalized demo.