3.9M
Downloads
792
Episodes
Are you an Amazon FBA, Walmart, or Ecommerce Seller, or someone interested in becoming one? The Serious Sellers Podcast by Helium 10 is an unscripted, unrehearsed, BS-free, organic conversation between host Bradley Sutton and real-life sellers and thought leaders in the ecommerce world, where they share the top strategies that will help sellers of all levels succeed. In addition, every week there is an episode of the “Weekly Buzz” which gives a rundown of the latest news in the Ecommerce world. ► Instagram: instagram.com/serioussellerspodcast ► Free Amazon Seller Chrome Extension: https://h10.me/extension ► Sign Up For Helium 10: https://h10.me/signup (Use SSP10 To Save 10% For Life) ► Learn How To Sell on Amazon: https://h10.me/ft ► Watch The Podcasts On YouTube: youtube.com/@Helium10/videos
Episodes
Tuesday Oct 10, 2023
#499 - Demystifying the Amazon Algorithm: The Power of AI in E-Commerce
Want to crack the code of Amazon's algorithm? Join us in this riveting episode as we sit down with Kevin Dolan, Principal Engineer of the AI Labs program at Pacvue x Helium 10, a mastermind who has scrutinized over 100 Amazon Science papers and run millions of tests to decode the intricate workings of Amazon's search framework. From semantics to lexical search and all the fine details in between, be prepared to gain invaluable knowledge and insights that will transform how you see Amazon's ecosystem.
Pressing on, we dive deep into the expansive landscape of AI with Kevin. We break down the complexities of Amazon Science's information retrieval papers, shedding light on the motivations, implications, and limitations. With a heavy focus on both the relevancy ranking system and the behavioral indicators, such as previous purchases and time spent on a page, we reveal the intricacies of how this system functions and the challenges faced by new products. You'll gain a comprehensive understanding of the balance between query intent and query volume, as well as the impact of micro-actions on a product's early life.
We wrap up our conversation by evaluating the role of Artificial Intelligence in selling success, discussing how semantic search differs from lexical matching, and lastly, looking at the promising future of the Amazon algorithm's evolution. This episode is a gold mine of knowledge and insights that will guide you in navigating Amazon's complex systems. With the help of Kevin's expertise and insights, you'll be better equipped to harness the power of Amazon's algorithm for your own Amazon FBA business’ success. Don’t miss this rare opportunity to gain insights from an expert who has spent countless hours demystifying the workings of Amazon. Enjoy the episode!
In episode 499 of the Serious Sellers Podcast, Bradley and Kevin discuss:
- 00:00 - Check Keyword Indexing With Helium 10
- 06:21 - Impressive AI Amazon Listing Builder
- 10:49 - Investigating the Amazon Algorithm
- 21:10 - How Amazon Search Works in Phases
- 25:43 - How Amazon Chooses Search Results
- 31:17 - Analyzing Product Metrics and Rankings
- 35:05 - Amazon Keyword Relevance and Product Ranking
- 38:44 - Amazon's Trend in Personalized Search
- 42:48 - Keyword-Based Search vs. Meaning-Based Search
- 50:53 - Advancements in Semantic Search Techniques
Transcript
Bradley Sutton:
Today we’re going to talk to somebody who might be the most knowledgeable person in the entire world about how the Amazon algorithm works. He studied over 100 Amazon Science papers and runs tests on millions of data points, and he’s going to be educating us on what he’s found on things such as semantic search, lexical search and more, including showing a shocking example of proof of how search on Amazon is evolving drastically even now. How cool is that? Pretty cool, I think. Did you know that just because you have a keyword in your listing, that does not mean that you are automatically guaranteed to be searchable or, as we say, indexed for that keyword? Well, how can you know what you are indexed for and what you are not? You can actually use Helium 10’s Index Checker to check any keywords you want. For more information, go to h10.me/indexchecker.
Hello everybody and welcome to another episode of the Serious Sellers Podcast by Helium 10. I’m your host, Bradley Sutton, and this is the show that’s a completely BS-free, unscripted and unrehearsed organic conversation about serious strategies for serious sellers of any level in the e-commerce world. And we’ve got, for the first time on the show, a fellow worker here at Helium 10, but one of the more unique ones. We’ve had product managers here, we’ve had, like, some of our executives, like Boyan has been on the show before, Adam has been on the show, but he’s got a very unique role, one of the coolest roles here, and we’re just going to get to know him, and you guys are going to be blown away by some of his insights into how the Amazon algorithm works, because he, probably more than anybody else in the entire industry, has done the most research on the subject. So, before we get into the details, first of all, Kevin, welcome to the show. How’s it going?
Kevin:
It’s going pretty good. That’s a lot to live up to, I got to tell you.
Bradley Sutton:
There we go, no pressure, no pressure. Where are you located?
Kevin:
Yeah, I’m out in Los Angeles, so I do live near a lot of the other Helium 10 people, a lot of the Pacvue people. We go sailing from time to time.
Bradley Sutton:
Is this where you’re going to be taking me sailing for my first time in a couple weeks?
Kevin:
Yeah, we’re taking you sailing next week actually.
Bradley Sutton:
Where did you go to school at?
Kevin:
Cornell.
Bradley Sutton:
Cornell, that’s...
Kevin:
Ivy League. It is technically an Ivy League. We get made fun of a lot for being the worst one, but you know.
Bradley Sutton:
Isn’t that where the guy from The Office went, or something?
Kevin:
Yeah, he went.
Bradley Sutton:
Okay, now I was like, wait, how do I know it? Classically, Andy from The Office.
Kevin:
Yeah, he brags about it a lot. That’s the kind of reputation that we try to avoid.
Bradley Sutton:
Okay, all right. What did you study there?
Kevin:
I studied computer science. Originally physics, but I ended up using computers a lot to do physics things, ended up liking that more, so I went down the direction of computer science. My focus was on artificial intelligence, which is why all this new stuff that’s been happening has been so exciting.
Bradley Sutton:
Yeah, so I mean you were into it before it was actually, you know, hip to be into it.
Kevin:
Before it was cool. Yeah, exactly, exactly.
Bradley Sutton:
And so now, what is your position here at Helium 10?
Kevin:
So I’ve actually been with Helium 10 since 2020. I’ve served in a bunch of different roles, jumping in, you know, helping with individual products, with higher-level stuff. Right now I’m serving as the principal architect for the AI Labs, and this is the AI Labs between both Pacvue and Helium 10.
Bradley Sutton:
Cool. So AI Labs is, like, almost like a secret organization inside of Helium 10. What is AI Labs?
Kevin:
Yeah, well, we try not to be a secret. We try to make sure everybody in the organization knows what we’re doing all the time. But basically, you know, even if you’re not in technology right now, you’re hearing about AI. You’re hearing about ChatGPT. You’re trying to figure out ways to use it. You’re worried it’s going to take your job. You know, outside of technology, a lot of people don’t become aware of these technological shifts, but sometimes you get these breakout technologies, and AI is one of them.
This whole stream of research really began in 2017 with the release of the first Transformers paper. It started to take off really, really hard in 2020, when some of these new techniques started to get better results than any other techniques. But when OpenAI released ChatGPT last year, there was this sudden explosion, and suddenly we’re seeing AI models that can do things that, say, 10 years ago, people would have never expected computers to be able to do, and so you’re seeing a lot of new products come out, a lot of new features. People are excited, people are scared.
I tend to be more on the conservative side when it comes to AI. I’m excited about it, it can do a lot of really cool things, but I think it can’t do a lot of the stuff that people say that it can. Right now, the goal of the AI Labs is basically to figure out what’s hype and what’s real. We’re trying to figure out which of these AI technologies we can use within Helium 10 and Pacvue products to make our tools better, to make our sellers have a better life, and we’re also trying to figure out ways that these AI technologies can change the ecosystem. So what’s going on at Amazon right now that might affect the way sellers sell on Amazon and the way people buy things on Amazon? That’s actually the part of the job that’s really, really exciting to me, because it forces you to predict the future, and that’s just really fun.
Bradley Sutton:
Yeah, it’s really cool how Helium 10 and Pacvue are embracing AI. We have a whole department here dedicated to it. We just hired another executive who was leading up AI at Microsoft, of all places. Obviously we have dedication, but somebody might say, wait a minute, like, I’ve barely seen Helium 10 come out with anything AI. This is what I think is cool, because I remember back in the day, I obviously have worked here at Helium 10 longer than you, but five years ago, when we were a tiny team, we weren’t number one, we probably weren’t even number two.
It was like Jungle Scout was number one, maybe Viral Launch was number two, and then Helium 10 was kind of a newer kid on the block, and because of that we had to be cutting edge, like, nonstop and try and be the first, and it was like a space wars, like who’s going to launch this reverse ASIN first and who’s going to launch an auto responder email, and then sometimes we just rushed to get something out and it wasn’t maybe the best, but in those days, like, being the first at something was super important in this growing industry. Now, for a few years, since you’ve been here, no coincidence there, we’re number one in the game. So it’s like, you know what, we don’t need to be first at anything. Like, when we launch something, let’s get it right. So we were definitely not the first to launch AI for listing building. But, man, it’s really amazing. Like, our listing builder tool now can do multiple languages and multiple marketplaces. I threw words into a Spanish listing, but then my inputs were, like, in English, and I even threw some Portuguese in there, but then the AI knew that this was for Amazon Mexico and then took my keywords and made a complete Spanish listing. I mean, it’s just like it even blows the Amazon AI listing builder, like, completely out of the water, let alone anybody else in our space here.
But, long story short, we’ve got this whole team that’s working hard and making sure that we’re doing the right things. But at the end of the day, you were talking about, you know, how AI is integrated into the search algorithm. We’re definitely gonna go deep into that, but just a preview, guys, like, I don’t know.
You can tell me what you think, Kevin, but in my opinion, we can read all the scientific documents we want, like you have, but at the end of the day, there is nobody on this earth, not even Amazon workers, or not just one Amazon worker, who could just tell you the exact formula of exactly how the Amazon search algorithm works, because that’s not the way it works. It’s not something that you can just turn into a mathematical formula. So, guys, you know, what we talk about today is gonna be based on, you know, a lot of research and things, but you gotta remember that, at the end of the day, it doesn’t matter if we’re coming up with something, or somebody else out there who read some scientific documents is coming out with something. It’s speculation, you know. I think what we’re gonna show today is that Kevin’s speculation probably is better than anybody else’s, since he’s read more. But are you with me there on that kind of, like, postulation?
Kevin:
Yeah, I mean exactly like you know. I think whenever a new technology like this comes out, people get excited about it. They wanna use it. They wanna release products that you know claim to use AI, even if it’s not really that big of an important component to it. You know, I recently heard from one of my friends who’s in venture capital that he’s, at this point, tired of hearing AI pitches. When somebody comes to him and says, all right, our company is, we’re doing X, but we’re using AI, basically everybody just rolls their eyes and I think the reason is because AI, at the end of the day, is a tool, it’s a technology, it’s not even a feature and it’s definitely not a whole product.
I think when the dust settles, the hype dies down and this becomes integrated into day-to-day life, you’re not gonna hear about it as much. It’s just gonna naturally be a part of so many systems that you don’t think about it. Just the same way that, you know, when you’re using Amazon as a buyer or as a seller, you’re not thinking about what databases they use on the backend or what fraud detection techniques they’re using. You don’t have to think about those low-level details because they’re just part of the system. We’re about to get to a place, hopefully in the next couple of years, where these things just become more commonplace, and that’s a lot of the approach that I take when I develop technology: I look at all of these things as tools that can be used to accomplish things, but at the end of the day, we still need to accomplish the things that our users want to accomplish.
Bradley Sutton:
Yeah, now, you know, we’re about to talk about the extent of your research and the ridiculous number of hours you’ve spent, you know, investigating the Amazon algorithm and stuff. But, you know, just like we said, nobody, not even in Amazon, can just, you know, know instantly how the Amazon algorithm works, how search works. So let me just ask you, like, why did you even do all this work in the first place, if you knew that, you know, the ceiling is not even a full knowledge of what’s going on? Like, why even put all this work into it?
Kevin:
Yeah, I mean, you know we are, at the end of the day, building products that help sellers sell things on Amazon. So the more that we understand about how Amazon search works, the better we’re able to do that. Yes, we’re never gonna be able to understand the whole thing, I would say. Within Amazon, it’s a large functional engineering organization, so the entire system is broken down into subcomponents. Some people are going to be experts on individual subcomponents. Some people are going to be experts on how all of the different subcomponents connect to one another. But, like you said, no one person really knows everything. And even if there are people at Amazon who can really say that they understand all the subcomponents at a deep level, they’re still not going to understand all of the emergent properties that come from the system.
Whenever you have a system that’s so complex that so many different people are using for so many different purposes, a lot of new behaviors start to come about. You get behaviors that come from the fact that people want to list on Amazon so that they rank more highly. You might not be able to predict ahead of time what that’s gonna look like. You might not be able to predict how buyers are going to change what they type into the search box as you change, how different search results come back, and so I think it’s something that, whether you work for Amazon, whether you build tools for Amazon, whether you’re an Amazon seller, or even if you’re an Amazon buyer, I think it’s important for you to understand what kinds of things are happening, because they give you hints to better understand how to interact with the ecosystem.
Bradley Sutton:
Okay now, what did your research entail? To what extent have you done that leads up to you giving us this information that you’re about to in this episode?
Kevin:
A little bit, a little bit. So the first place I went was the blogs, which I think was probably a mistake. They’re written for a different audience. I understand that they’re gonna be non-technical.
Bradley Sutton:
Like blogs from Amazon, or just, like, blogs from people in the industry?
Kevin:
Industry blogs and blogs from Amazon as well, but in general they’re written for non-technical audiences, and so I understand there’s gonna be some loss in translation when you get there. What I was astounded by was just how wrong they are. A lot of the articles make things up. A lot of them point to research that isn’t likely to be part of a production system. A lot of them talk about search techniques that were popular 20 years ago, and we’ve moved well past those technologies, and so I think it is generally safe to assume that if you’re reading it on a blog, you can take it with a grain of salt. There might still be some useful information there. It might still be relevant when you are writing your listings, but at the end of the day, it’s not canon.
Amazon operates a publication that they call Amazon Science. They have a number of programs internally that lead to this, but this is basically where they release a lot of their public-facing academic research, and there’s a section within Amazon Science that’s on the subject of information retrieval, which is the academic term for search. It’s one of the biggest sections. I went through and basically looked through the hundreds of papers that they had listed for publication there. I selected about a hundred of them that were gonna be relevant towards giving us hints about how Amazon...
Bradley Sutton:
How many pages does each of these have, these hundred that you read?
Kevin:
It depends. It depends. Yeah, I mean, some papers will be as short as just a couple of pages, like two or three. Some will be 20 pages long. Some have a lot of appendices, a lot of formulas. When you do academic research, you get really good at skimming papers for the important parts and making sure that you’re not wasting time reading stuff you don’t need to read. But it’s a skill in and of itself, for sure.
Bradley Sutton:
Now, the first time... I haven’t read many, and I think you by far have read more than anybody else in the world. These aren’t written by the same people, so probably, even counting Amazon employees, you’ve read more of these than anybody else. The first time it came on my radar, I was looking something up and I found a patent that Amazon had filed for something about search, and that was the first time I was diving in. I was like, my goodness, this is interesting stuff.
Like, a lot of it was way over my head, since I’m not a data scientist and using language, but then it allowed me to understand even parts of, like, what we call the honeymoon period. They call it, like, cold start and stuff like this. It was just fascinating to read. But then I found out later that Amazon is just publishing all these papers left and right, like you said. But, like, first of all, why are they doing this? And then, second of all, correct me if I’m wrong, but just because they publish a paper on something, it doesn’t mean, like you said, that they actually have what’s in that paper in production at Amazon, or that it’s even imminent to hit production.
Kevin:
Yeah, exactly. I would say that there’s a lot of different reasons why companies release academic papers, and actually, up until about 2020, Amazon was very careful about the information relating to their system that they released to the public. They might release a patent, but patents are incredibly vague, and the reason you release a patent is specifically to prevent other people from being able to do that or, at the very least, prevent other people from suing you for doing something similar. But when you release academic research, you’ve definitely got a different set of motivations there. You run the risk of, say, a competitor adopting the same techniques, something that’s part of your secret sauce becoming commonplace, and so you do have to weigh that against the benefits. But there are a lot of benefits to doing this. If you look at other companies like Google: Google, unlike Amazon, came from academic research. The founders of Google created the PageRank algorithm, which was originally called the Backrub algorithm.
Bradley Sutton:
I bet you there’s a great story behind that there.
Kevin:
So you know, like Google’s approach was academic driven from the beginning and as a result of that, the academic research that goes into web search is probably a decade ahead of what you see in e-commerce search. When you release these papers as part of your company, when you get them out there, at the end of the day, what you’re doing is you’re sparking innovation on the subject. You’re sparking innovation within your own company because you’re able to recruit the best talent. If I’m an engineer or I’m a data scientist or I’m a researcher, I might now want to work at Amazon because I know that there’s a chance that I can release a paper that’s gonna be really important, that’s gonna be really great for my career. I know I’m gonna be working with the state of the art technologies and like that’s really exciting, so you’re able to get better talent that way. It also allows you to work with people who are in academia.
So one of the challenges that e-commerce search has faced in academic research on information retrieval is the availability of data, because Amazon doesn’t want to release to just anybody their search volumes, their search history, what products people are clicking on, and so it’s a lot harder for somebody who works at Cornell University to do research on that subject. Amazon started a program called the Amazon Scholars Program, where somebody who is perhaps a PhD candidate or a university professor can kind of be embedded in a team at Amazon to help them develop something, and in many cases conditions of that would be that they get access to data and they’re able to release important papers, and so at the end of the day you get these papers out there. They help you develop new technologies and new techniques, but it also sort of fosters this broader ecosystem of research. That happens so that just in general in the world there’s better knowledge about how to do things. It’s worked really well for web search. E-commerce search has been a little bit slower to do this kind of thing, but they’re catching up.
Bradley Sutton:
Yeah, all right. So, guys, the first takeaway here is: we could have the person who’s read the most scientific documents in the world, or we could have somebody who just reads one scientific document, but what you can’t do is just take one of these and say, okay, because of what I read here, this is proof of what’s going on in Amazon. You know, I mean, otherwise we’d be releasing blogs every week with Kevin as a byline talking about what he’s learned from there. So, again, just keep in mind that not any one of us can just take one of these documents and explain what is happening on Amazon. But, that being said, there’s a lot that we can take away, both from what you’ve researched in these documents and from the fact that, obviously, being at Helium 10, you have access to more data points than almost anybody outside of Amazon, and so you can actually study behaviors and trends and things like that. So let’s just start at a very basic level. How does Amazon search work today?
Kevin:
Yeah, I mean, there’s some things that we can tell from the papers that are kind of like canon knowledge. One section of almost every paper is usually, like, an introduction or discussion section where they talk about why this problem exists, what they’re trying to solve, and sort of what the context of the problem is. In these, it seems to be pretty well accepted that search happens in three high-level phases. The first phase is query understanding. So Amazon is gonna be looking at the query that was searched. They’re possibly gonna be looking at your past queries, your past purchase history, to try and better understand what it is you’re looking for. Phase two is a matching phase, which is basically looking at Amazon’s catalog of billions of products to try and find a smaller set of products that are likely related to that search. So now we’ve whittled it down from, say, a few billion potential products down to a thousand products. And then the final phase, which is really the most important, is the ranking phase. The ranking phase is where it determines what order to show those thousand matching products in. This is, I think, where Amazon has the best opportunity to use the more advanced technologies to really precisely understand what somebody’s looking for and what somebody’s offering, and to put the things that somebody’s most likely gonna buy at the top of the search results. But even as you go further down the list, ranking is still important.
Amazon talks about a lot of situations where the priorities and goals that Amazon has for ranking are actually different than what you might think. You might think that when you’re making a search, you want to show in the number one search result the product that the person’s most likely gonna buy, but that’s not always the case. They use this example a lot in their research: engagement rings. So, I don’t know about you, but I don’t know many people who are buying diamond engagement rings on Amazon. If you’re going to Amazon and you’re typing in engagement ring, you’re most likely gonna be looking for something like cubic zirconia, something a little more affordable, and so you would think that when you search this, the top results should all be affordable diamond rings. The reality is that if people see that, they think something is wrong with Amazon, because they expect to see diamond rings at the top, and so in certain cases, Amazon has goals that are more related to user expectations than to quantitative, optimizable goals, and so it’s a really complex system. But ranking is really where a lot of the secret sauce is.
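To make the three phases concrete, here is a minimal sketch of a query understanding, matching, and ranking pipeline. It is a toy illustration only, not Amazon’s implementation: the `Product` fields, the token-overlap matching rule, and the `past_purchases` tiebreaker are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Product:
    asin: str            # hypothetical identifier
    title: str
    past_purchases: int  # hypothetical behavioral signal


def understand_query(raw_query: str) -> list[str]:
    """Phase 1 (query understanding): normalize the query into lowercase tokens."""
    return raw_query.lower().split()


def match(tokens: list[str], catalog: list[Product]) -> list[Product]:
    """Phase 2 (matching): keep only products whose title shares a token with the query."""
    wanted = set(tokens)
    return [p for p in catalog if wanted & set(p.title.lower().split())]


def rank(tokens: list[str], candidates: list[Product]) -> list[Product]:
    """Phase 3 (ranking): order by token overlap, then by past purchases."""
    wanted = set(tokens)

    def score(p: Product) -> tuple[int, int]:
        overlap = len(wanted & set(p.title.lower().split()))
        return (overlap, p.past_purchases)

    return sorted(candidates, key=score, reverse=True)


def search(raw_query: str, catalog: list[Product]) -> list[Product]:
    """Run the three phases in order and return the ranked candidates."""
    tokens = understand_query(raw_query)
    return rank(tokens, match(tokens, catalog))
```

The point of the sketch is the shape: matching is cheap and cuts the catalog down, so the more expensive ranking logic only has to run on the small candidate set.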
Bradley Sutton:
Okay, now how does AI play a role like, for example, in misspellings? Or is this just something that has existed, where machine learning is used and it just learns to try and predict buyer intent based on most common misspellings? And then, how does Amazon work in that regard?
Kevin:
I mean, misspellings is an interesting case, so that usually falls into the query understanding bucket. So somebody types in a query. There are a lot of existing techniques for spell correction that you might call AI, you might call machine learning, but they definitely aren’t the same thing that’s being referred to as AI that everybody’s excited about right now. It’s a pretty well researched area, and so when you type a query into Amazon, one of the first things that they’re gonna do is run it through a spell check algorithm to try and figure out if there’s some obviously misspelled word that they can correct. And that’s usually gonna be, like, one of the first steps in query understanding.
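As a rough illustration of spell correction as a first query-understanding step, the sketch below snaps each unknown query word to its closest vocabulary word using Python’s standard `difflib` module. The vocabulary list and the 0.8 similarity cutoff are assumptions for the example; a production system would use far larger vocabularies and query logs, and nothing here reflects Amazon’s actual method.

```python
import difflib

# Hypothetical vocabulary of known search terms (an assumption for this sketch).
VOCABULARY = ["pressure", "cooker", "engagement", "ring", "garlic", "press"]


def correct_query(query: str, vocabulary: list[str] = VOCABULARY,
                  cutoff: float = 0.8) -> str:
    """Replace each unknown word with its closest vocabulary word, if one is close enough."""
    corrected = []
    for word in query.lower().split():
        if word in vocabulary:
            corrected.append(word)
            continue
        # get_close_matches returns the best matches with similarity >= cutoff.
        matches = difflib.get_close_matches(word, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)
```

With this vocabulary, `correct_query("presure cookr")` gives `"pressure cooker"`, while a word with no close neighbor is left unchanged.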
Bradley Sutton:
Okay, now you know the holy grail or goal of any Amazon seller when they’re launching products or when they’re trying to get sales is hey, I need to get on, quote unquote page one of Amazon search results. You know, this is nothing new, you know, you’re a blogger, you’re an SEO person, you want to get. You know, page one on Google results. You know, but on Amazon, how? In general? You know we’re going to simplify this down before we go deep into the weeds here, but how does Amazon choose what to show? You know, first in search results?
Kevin:
Yeah, I mean, this is a really complicated question. Like we said earlier, Amazon is a modular system. You know, there are going to be a lot of different teams working on a lot of different subsystems. These subsystems are going to be interacting in different ways, but in general, there’s a broad category of algorithms that work really well with these types of modular systems, and these are called learning-to-rank algorithms, and basically the idea is that you set some set of goals that you want.
You know, a goal might be: I want the product that the user is most likely to buy to be at the top. A goal might be: I care about my long-term relationship with the user, so I’d make sure not to show them things that they’re going to return. You know, I care about my long-term relationship with sellers, so I want to give new products a chance. And so the system’s going to have a lot of different goals that it’s able to juggle, and a learning-to-rank algorithm will take a series of signals and try to put things in the best order so that they can achieve those goals. This also allows engineers at Amazon to be modular in how they define those signals, so I might define relevance signals in terms of, like, okay, you’ve searched for a query with these particular words in it. I know that those words are in the title, so you should probably rank that a little bit higher. However, another signal might be that everybody who buys this product seems to return it, so that’s a bad thing.
These signals can be mixed together to basically come up with a ranking that best accomplishes those goals, and it’s really important to stress that these signals can be defined by, like, a lot of different engineers. You might have data scientists who are working on relevance factors, as to whether or not a particular listing means the same thing that somebody is expressing in a query. You might have somebody who’s focused more on behavioral signals. We see that what people have bought in the past is a really important signal for Amazon. If you search for a common query, something as simple as, like, pressure cooker, Amazon knows which things people have bought when they’ve typed in pressure cooker, so that might actually end up being the dominant signal. But the learning-to-rank algorithms basically allow you to, instead of sitting there and, like, manually tuning all the rules that land at some particular ranking, let the system sort of figure out how to do it.
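One simple way to see how a system can learn to mix signals, rather than have someone hand-tune the rules, is a pairwise perceptron ranker: it is shown pairs of listings where one was preferred over the other and nudges the signal weights until the preferred listing scores higher. This is a deliberately tiny stand-in chosen for illustration, not the algorithm Amazon uses, and the three signals (title match, past purchase rate, return rate) and their values are hypothetical.

```python
def dot(w: list[float], x: list[float]) -> float:
    """Weighted sum of signals: the score used for ranking."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_pairwise_ranker(pairs, n_features: int, max_epochs: int = 100) -> list[float]:
    """Learn signal weights from (preferred, other) pairs of signal vectors.

    Whenever the preferred item does not outscore the other one, move the
    weights toward the difference between the two signal vectors.
    """
    w = [0.0] * n_features
    for _ in range(max_epochs):
        mistakes = 0
        for preferred, other in pairs:
            diff = [p - o for p, o in zip(preferred, other)]
            if dot(w, diff) <= 0:  # pair is ordered wrongly (or tied)
                w = [wi + di for wi, di in zip(w, diff)]
                mistakes += 1
        if mistakes == 0:  # every training pair is ordered correctly
            break
    return w


def rank(w: list[float], candidates):
    """Sort (name, signals) candidates from highest to lowest learned score."""
    return sorted(candidates, key=lambda c: dot(w, c[1]), reverse=True)


# Hypothetical signals per listing: [title_match, past_purchase_rate, return_rate]
TRAINING_PAIRS = [
    ([1.0, 0.9, 0.1], [1.0, 0.2, 0.8]),  # bought often, rarely returned beats the opposite
    ([1.0, 0.5, 0.2], [0.0, 0.6, 0.2]),  # query words in the title beats no title match
    ([1.0, 0.4, 0.1], [1.0, 0.5, 0.6]),  # low return rate beats slightly higher sales
]
```

Training on these pairs yields a positive weight on title match and purchase rate and a negative weight on return rate, without anyone writing those rules down explicitly. New signals can be added as extra vector entries without touching the training loop, which is the modularity described above.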
Bradley Sutton:
Okay, what about when we’re talking new products? All right, you know, that’s just in general how search works. But then, you know, especially when it comes to, you know, a history of interactions, like what happens on a mature product, Amazon can easily know what to prioritize because they have how people have clicked, how long people have stayed on a page, or how they scroll. I mean, there’s like billions of data points they have after listings have been out there for six months. But going back to, you know, what I saw in that one document about what was called the cold start problem: how does Amazon determine relevancy and things for a brand new listing, you know, that doesn’t have this history?
Kevin:
Yeah, so it’s a problem. It’s a very active area of research. Of the hundred papers I read, at least a dozen were about the cold start problem. So I would say this is definitely something that is a focus for Amazon. It’s something that they care about really deeply and want to solve, and it’s also something that they haven’t entirely solved.
The cold start problem basically says that because, for common queries, behavioral signals dominate and are so predictive of what somebody is likely to buy, the rankers are typically going to show products with deep history above products with, you know, a shallow history, like a new product. If I’m searching for a pressure cooker, it’s going to show me the pressure cooker that people usually buy. If some company comes and makes a better pressure cooker, something that really just completely blows the existing ones out of the water, Amazon’s not necessarily going to show that to users because it’s got no behavioral priors. So solving the cold start problem is really important, and there’s a bunch of different techniques for doing that. Some of the techniques you’ll see in the literature are called bandit optimization, which uses a gambling game as a sort of analogy for how you learn whether you should continue exploring new options or exploit the options you have. But I would say most of the research, and the papers that seem the most important, lead to this idea of using details about the listing to predict behavioral priors. The idea is basically that you look at things like the quality of the listing, the seller, the image, the title, and you try to analyze them to a degree where you can estimate how likely somebody might be to click the product.
At that point, you start to use those estimates to intermix new products with the existing products, and then you can measure behavior against them. So you could say, all right, I’ve started to rank this new pressure cooker high, even though I don’t have many priors for it, and it seems like people are really excited about this thing. It seems like people are buying this. So now I’m going to update my estimate and say, okay, your optimistic estimate is really good. If I start to show it to people and nobody wants it, or people buy it and they start to return it, then I’m going to start to rank that item lower.
I think this is what leads to what you and the industry refer to as the honeymoon period. It’s this idea that there is a brief window of time when you launch a new product where Amazon is going to be ranking it more favorably than if it had a long history of poor-performing sales, and so I think this is something that’s definitely an emergent property of the system. It’s not necessarily something where Amazon sits there and says, okay, the honeymoon period is two weeks. It’s definitely not something like that, but it is something that may emerge from the system as designed.
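The predict-then-update loop described here can be sketched with a simple Beta-style estimate: start from a prior guessed from the listing itself, then fold in real purchases as traffic arrives. This is only an illustration under stated assumptions. The `predicted_rate` input, the `strength` knob, and all numbers are hypothetical, and nothing here is claimed to be what Amazon actually runs.

```python
from dataclasses import dataclass


@dataclass
class ListingEstimate:
    """Beta-distribution-style estimate of a listing's purchase rate."""
    successes: float  # pseudo-count of impressions that led to a purchase
    failures: float   # pseudo-count of impressions that did not

    @property
    def mean(self) -> float:
        return self.successes / (self.successes + self.failures)

    def update(self, purchases: int, impressions: int) -> None:
        """Fold observed behavior into the estimate."""
        self.successes += purchases
        self.failures += impressions - purchases


def prior_from_listing(predicted_rate: float, strength: float = 20.0) -> ListingEstimate:
    """Turn a rate predicted from listing details into a weak prior.

    `strength` is a hypothetical knob: how many impressions the prediction is worth.
    """
    return ListingEstimate(predicted_rate * strength, (1.0 - predicted_rate) * strength)


# A new product whose listing quality suggests a 10% purchase rate.
new_cooker = prior_from_listing(predicted_rate=0.10)
optimistic = new_cooker.mean

# Shoppers respond well, so the estimate moves up.
new_cooker.update(purchases=30, impressions=100)
after_good_week = new_cooker.mean

# Same starting point, but shoppers ignore it, so the estimate moves down.
flop = prior_from_listing(predicted_rate=0.10)
flop.update(purchases=1, impressions=100)
after_bad_week = flop.mean
```

Because the prior is only worth a small number of impressions, early behavior moves the estimate a lot, which is one way a brief favorable window could fall out of a design like this without anyone setting a fixed length for it.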
Bradley Sutton:
Yeah, so in the industry, people talk about query intent versus query volume, search query volume. You know, don’t always just go for, hey, kitchen utensil has 50,000 search volume, and that’s what I need to be focusing on when I launch my product or when I’m doing my listing optimization, as opposed to something with only 600 searches but that’s super hyper relevant, like aluminum spoon for boho decor or something like that.
And so I found that in early listings, and I’m going to talk about this more in episode 500 of the podcast, where I talk about the Maldives honeymoon method, micro actions mean exponentially more in the honeymoon period, or whatever you want to call it, the first few weeks, cold start, whatever, of a product. You can drastically change how Amazon views your product. Now, the first part is you guys obviously have to have your listing optimization down. You know, I can’t have a water bottle where the copy of the listing just talks about kitchen utensils, or talks about, I don’t know, podcasting equipment and just random stuff, and then Amazon just miraculously is going to figure out that I’m talking about a water bottle. No, you’ve got to make sure your listing has all the potential keywords so that you give Amazon something to start with. And the thing is, you could do all of the right things. I’m talking about this in the next episode, how we’re working on something so that when people are using Listing Builder, they have a scoring system that’s kind of based on best practices. Like, do you have all the right keywords? Are you indexed for them? How many times do you have them, and do you have them in the right parts of your listing? But at the end of the day, you could have a perfect score, if there ever was one, and still Amazon might not 100% know what the product is, especially if it’s a newer niche. If it was a water bottle, probably from day one Amazon has everything ready. As long as you have a great listing, Amazon knows exactly what your product is.
I’m gonna talk about how I launched, or I did some dual testings on, this coffin-shaped bath tray, which maybe only one or two people have ever sold on Amazon, and not many units. And when I check in Helium 10, there’s a way to check which keywords Amazon views as relevant. It was obviously confused. Like, coffin is probably one of the most main words because that’s how it’s shaped, and it was way, way, way down the list as far as what Amazon thought was relevant, and it had these generic terms that had great search volume. It was funny because I could rank easily on a keyword that another product probably wishes they could rank for. It was something like bathroom decor or something that had like 200,000 searches.
And just because Amazon rated that as the most important keyword to my listing, I was already ranking in the first six pages, even though there’s 30,000 other products that are indexed for that keyword. I was on page six and, you know, temporarily I was even on page one without doing anything, while other people are getting purchases for this product and they can’t get on page one. But does that really do me any good? No, it doesn’t do me any good, because that’s not necessarily what my product is and nobody’s gonna search that keyword and then buy my product. And so, sure enough, the keyword dropped down, but it shows you guys that you’ve gotta be thinking about the quality of the keyword and how relevant it is, especially in the beginning. So that’s kind of like you’re training Amazon to understand how to interact with that. Now, how does advertising play a role in all of this, in your opinion?
Kevin:
Yeah, I mean, well, I think what you just said is really kind of key across the board when it comes to writing your listings or using advertising. Putting the optimization effort into your listing to describe your product accurately, and to target keywords that are more intent focused, that people are more likely to use to purchase your product, I think that’s really good advice. I think that’s gonna apply across the board. We do know that when somebody performs a search and ultimately buys a product through a sponsored listing, that behavioral signal still counts. It’s a technique that I know a lot of people recommend for how to sort of seed your behavioral signals.
At the end of the day, if people are buying your product, if they’re not returning it, if they’re leaving good reviews, those are all gonna be things that lead to better organic rankings.
We do see some evidence that the paid signals count a little bit less than the organic signals, but they still count, and so I think that advertising is definitely a critical part of any product launch. But I think, just like you were talking about with organic optimization and search engine optimization in general, your focus in the beginning really needs to be on high intent queries rather than high volume queries. If you are showing up on page one for a search query that has a lot of volume and people are not likely to buy your product, that’s actually likely going to damage your organic search rankings way worse than if you had ranked highly for keywords that are very relevant to your product. Because what’s gonna happen is Amazon has now shown your product to people and people have rejected it, which Amazon is gonna be considering a negative signal, and that could start to affect you on other keywords as well.
Bradley Sutton:
Okay, now, a lot of the documents that you read might be two or three years old already, but of the ones that you’ve read that were maybe published this year, have you seen any trend that makes it kind of obvious that Amazon might be moving in a certain direction?
Kevin:
Yeah, I mean, there’s a lot of stuff on the cold start problem, and I think that there seems to be a narrowing focus on this approach of estimating behavioral signals based on the listings, and I think you’re seeing some sort of corralling of resources in that direction. There are a few papers on personalized search, and it’s kind of always been in the background. I think there’s some evidence that Amazon might be re-ranking products based on individual preferences. If you’re a bargain hunter, I might be showing you different products than if you’re the type of person who likes to buy the more expensive, more luxury products. There’s definitely some query rewriting that might be happening based on recent search history. So if you’re looking for a lot of kitchen utensils, versus if you’re looking for a bunch of hunting and camping equipment, and then you search for something generic like knife, Amazon’s gonna be thinking about what you’ve recently searched for to try and understand what types of products to show you. So there’s definitely some stuff going on there. There are always papers about UX improvements, little things that they can change in the site, or big things that they could change in the site and how people search. I don’t think that the general UI of searching for something in a text box and then seeing either a grid or list of listings goes away anytime soon. I think that’s a great way to look for products. You’ll see a lot of people who are pumped about large language models and AI talk about conversational search or shopping assistants and things like that. I’m not too excited about those. I don’t think that those are gonna really change how people look for products.
But one area where Amazon may start to invest is result explicability: explaining to you why a particular listing might be relevant. They already do this to some degree. Like, when you type in a search, they’ll highlight any of the words from your search that are in the listings to help you better understand it. With LLMs and other generative models, you can start to explain in more natural English, like, hey, this one might be really good for you because X, Y and Z, so you might see some UX changes there.
There’s a lot of work on neural rankers, so this is sort of a technological detail of how they choose which products to rank higher than others, rather than fundamentally changing the way learning to rank works. So it’s not super relevant. But I think probably the most important and most impactful area of research is this space called semantic search. Semantic search is basically looking through the listings and trying to find the listings that are most relevant to a particular query, based on the meaning of the words rather than the literal words that are in both the listing and the search.
Bradley Sutton:
So give an example. Like, the counter to that would be lexical matching. How is semantic matching different?
Kevin:
Yeah, so I would say that today Amazon is still probably dominated by lexical matching. Lexical matching is the historical winner for search. It’s become less important in web search but it’s still a major factor there, and e-commerce has sort of fought this battle between lexical and semantic search for the past six years. In a universe of lexical search, as a person who is searching, I’m trying to guess which words would likely be in a listing for a product that I’m looking for, and it requires you, as the searcher, to have a skill set for searching. The ideal in information retrieval is that you don’t have to have a skill set in order to find things. You just type in, here’s what my situation is, here’s what I want, and the search engine brings you back exactly what you’re looking for. If you really wanna solve that, you can’t use a keyword-based approach as your only solution. You need to really start thinking about the meaning of those words you’re typing.
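A toy comparison shows the difference between the two kinds of matching. The lexical scorer only counts literal word overlap, while the semantic scorer compares small word vectors by cosine similarity. The vectors below are hand-made and hypothetical (real systems learn them from data), and the example query and listings are invented.

```python
import math

# Hypothetical hand-made word vectors: (furniture-ness, seating-ness, kitchen-ness).
# Real systems learn these from data; the numbers here are invented.
WORD_VECTORS = {
    "couch":  (0.9, 0.9, 0.0),
    "sofa":   (0.9, 0.8, 0.0),
    "grey":   (0.1, 0.0, 0.0),
    "spoon":  (0.0, 0.0, 0.9),
    "wooden": (0.2, 0.0, 0.2),
}


def lexical_score(query, listing):
    """Fraction of query words that literally appear in the listing."""
    query_words, listing_words = set(query.split()), set(listing.split())
    return len(query_words & listing_words) / len(query_words)


def embed(text):
    """Average the vectors of known words (assumes at least one word is known)."""
    vectors = [WORD_VECTORS[w] for w in text.split() if w in WORD_VECTORS]
    return tuple(sum(dim) / len(vectors) for dim in zip(*vectors))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def semantic_score(query, listing):
    """Similarity of meaning, as far as the toy vectors capture it."""
    return cosine(embed(query), embed(listing))


query = "couch"
sofa_listing = "grey sofa"
spoon_listing = "wooden spoon"

lexical_sofa = lexical_score(query, sofa_listing)      # no shared words
semantic_sofa = semantic_score(query, sofa_listing)    # close in meaning
semantic_spoon = semantic_score(query, spoon_listing)  # far in meaning
```

The shopper who types "couch" never has to guess that the seller wrote "sofa": the lexical score is zero, but the semantic score still puts the sofa listing well ahead of the spoon.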
Bradley Sutton:
Now, dive a little bit deeper into what you’ve found in the documents, as far as you can kind of say where Amazon might be going with the search, because, like you said, this is something that’s already been kind of existing in the Google and regular search engine world, but it’s been a little bit slower for e-commerce places like Amazon to adopt. But where do you think we’re going with this?
Kevin:
Yeah, I mean, I think that it generally seems to be accepted as true that using new advanced AI-based technologies to match products would give people better listings. It would produce better rankings in matching the user’s intent. I think we have really, really strong evidence that that’s the case. The problem is that they tend to be slow and expensive, and so a lot of the research today has focused on using AI during ranking. So, instead of running an advanced AI model across all the billions of products on Amazon, I could run it across just the top 1000 that match your query, and then I could find the exact product that you’re most likely looking for.
And I think it’s pretty safe to assume that Amazon is using some type of semantic analysis at the ranking stage. What’s a little bit less clear is what they’re doing at the matching stage. I have some examples that suggest it’s most likely a technique that Amazon discussed in a 2019 paper, called DSSM. I have some evidence to suggest that they are at least using that for matching in cases where there aren’t a lot of lexical matches for a particular search. They may be using different techniques, but I have some evidence of this, and I think that it’s definitely safe to say that they’re using semantic search at the ranking level to make sure that the top results you see are exactly the thing you’re looking for.
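The split between a cheap matching stage and an expensive ranking stage can be sketched as a two-stage search: a fast filter runs over the whole catalog, and the slow scorer only runs on the shortlist. The catalog, the product IDs, and both scoring functions below are hypothetical stand-ins. In particular, `expensive_score` is just a placeholder for a slow model, not a real semantic ranker.

```python
# Hypothetical five-item catalog; a real one would have billions of listings.
CATALOG = {
    "A1": "stainless steel pressure cooker 6 quart",
    "A2": "electric pressure cooker with timer",
    "A3": "pressure washer for cars",
    "A4": "wooden spoon set",
    "A5": "rice cooker small",
}

expensive_calls = 0  # counts how often the slow scorer runs


def cheap_match_score(query, text):
    """Stage 1: number of shared words; cheap enough to run on every listing."""
    return len(set(query.split()) & set(text.split()))


def expensive_score(query, text):
    """Stage 2 placeholder for a slow model: shared words divided by listing length."""
    global expensive_calls
    expensive_calls += 1
    words = text.split()
    return len(set(query.split()) & set(words)) / len(words)


def search(query, top_k=3):
    # Stage 1: matching over the whole catalog, keeping the best top_k matches.
    matched = [(cheap_match_score(query, text), pid) for pid, text in CATALOG.items()]
    shortlist = [pid for s, pid in sorted(matched, reverse=True) if s > 0][:top_k]
    # Stage 2: run the expensive scorer on the shortlist only.
    return sorted(shortlist, key=lambda pid: expensive_score(query, CATALOG[pid]), reverse=True)


results = search("electric pressure cooker")
```

With `top_k=3`, the slow scorer runs three times instead of once per catalog entry, which is the same trade described above, only at a much smaller scale.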
Bradley Sutton:
Now, is that the one... was it you who found that, or was it Adam Shabazz? About the noodle camera.
Kevin:
Noodle camera, yeah, yeah, the noodle camera. I think this is one of the strongest pieces of evidence I have for Amazon using semantic search during matching. So I wrote an algorithm that I called adversarial search generation. Basically the idea was to generate searches that are phrases that somebody might use, that kind of make sense from a language perspective but don’t have a lot of lexical matches. And one of the search terms that the algorithm came back with was noodle camera.
Noodle camera is not a thing. Yeah, I thought perhaps it was a thing, but I Googled it. Nobody calls this a noodle camera. Most of the results that come back are endoscopes. Endoscopes are also called snake cameras, and I have two main explanations for how this result is coming back for noodle camera. The first is a query rewriting explanation. It is possible that somebody who didn’t know the name of an endoscope might call it a noodle camera and they would search for that, not get any results, and then later search for snake camera, later search endoscope, and end up buying that. So Amazon might be doing some kind of query rewriting behind the scenes.
Another explanation is semantic. A snake and a noodle are similarly shaped; they’re long, cylindrical objects. And so, going down that direction, that would seem to suggest that they may be doing some kind of semantic analysis of the words that you’re searching to try and find something that at least resembles the user’s intent. We tried a couple of variations of that that were also interesting. So when I searched for eel camera, figuring eel and snake are similar enough, it came back with an entirely different set of results. It came back with underwater cameras, and I think that’s really interesting. Again, in this case, when you look at the listings, none of them talk about eels, but eels are similar to fish. They do talk about fish, and an underwater camera is exactly the type of camera that you would use to take a picture of an eel. And so I think that, at the very least for cases where a search doesn’t bring back a lot of lexical matches, there’s a good chance that Amazon is augmenting those matches with some kind of semantic techniques.
Bradley Sutton:
Yeah, guys, you know, I dug into this too, and anybody can do this. Everybody, if you’re not in your car but you’re at home, just type in noodle camera and then you’ll see what he was talking about. Like, there’s one brand here that could be kind of a misspelling of noodle, called new e. It has a camera, and it’s actually interesting: when I entered that into ChatGPT, that was one of the first things that came up. It was like, oh, some people misspell noodle camera for this new e brand. It was really weird. And so, you know, Amazon picked up on that. But then most of these are these kind of endoscope cameras, and so I took one of these. All right, I didn’t take just one of them, I took most of these endoscope cameras, and then I first threw them into Index Checker inside of Helium 10, and it’s interesting.
You know, it says that noodle camera is indexed, and rightfully so, because it obviously shows up in the search results. But usually when you have a phrase that is indexed, the individual words are indexed as well. But then I broke out the word noodle, and noodle is not indexed while noodle camera is. And then I went to the Ajax page, a lot of you guys know that Ajax page where I can look at the whole back end of a listing, and if I type in here, there is no noodle written in the front end, in the back end, in Amazon’s, you know, features. Nowhere is noodle in this listing or any of these other listings that are on this page. And yet clearly it is indexed for noodle camera. Not only is it indexed, Amazon is showing it, it’s ranking for it. And so, you know, I don’t want to scare people, or have people start to think:
Wait a minute. If Amazon goes full semantic search, then tools like Cerebro and Helium 10 or Listing Builder, and optimizing your listing, are not important, and it’s gonna be out of date. No, there is always gonna be a need to understand the right keywords and to build your listing. Otherwise Amazon won’t even know how to do a semantic match to your product if it doesn’t have a baseline. And so you’ve got to be traditionally indexed, lexically indexed, that’s a thing, for the right keywords for this semantic matching to take the next step. But what should sellers, do you think, keep in mind over the next couple of years? What does this move towards semantic searching mean for Amazon sellers and how they optimize their listings?
Kevin:
Yeah, the technique that I was just talking about, DSSM, is a fairly old technique. The paper Amazon released was in 2019, but the technique itself goes back a few years before that. There are newer techniques that are closer to these families of exciting large language models.
In particular, this past May, in May 2023, they released a paper where they created a small BERT model, which is a much more advanced type of semantic search, and the challenge generally would be when running with one of these BERTs. Does it stand for anything? It stands for, I think, Bidirectional Encoder, something, Transformers. Yeah, okay, so it’s not just the dude who discovered it; it’s an acronym. I can’t remember the whole name, but anyways, they use this BERT model to basically create a set of what they call embeddings to index listings in a semantic way and then perform search over that. This was something that we knew we could do for a long time, but they found a clever way to do it very fast, and this is going to be the difference between whether or not they use these technologies in production, because speed really matters when it comes to searching. You know, the difference between Amazon getting you your search results back in a hundred milliseconds and in ten seconds would be that you go and shop somewhere else, and so, even if you’re getting better results back, you need to get the results back fast. And so as we start to see new techniques that can do this deep level of query analysis, this deep level of product analysis, as we start to see these techniques that can be run more quickly, we’re going to start seeing them more in actual Amazon search results. I think that within three years, you’re probably going to be in a world where semantic-based search actually starts to dominate lexical search results. You know, that’s my personal guess. Who knows what it really looks like in the future. But I think at the end of the day, it is where things are heading, and I think that does change the way people need to write their listings. But I also think that’s a win for pretty much everybody involved.
I think that buyers don’t like keyword stuffing because in order to get to the information they want about a listing, they’ve got to go through a bunch of random words. I don’t think sellers like it because it’s tedious. It’s a lot of work. I’ve talked to a lot of sellers who, you know, describe this as a major pain point in their jobs is trying to make sure their listings contain all the literal keywords that somebody might be searching for, and I don’t think Amazon likes it either because it makes Amazon look sketchy.
You see all throughout the literature how Amazon really tries very hard to make sure it doesn’t look like a flea market. You know, they don’t want the search results to look like the results that you get on a site like Alibaba, and so everybody’s incentivized to get rid of keyword stuffing. But because of the limitations of the way search is done today, it’s basically an inevitable emergent property, and so I think as we start to see semantic search become more viable, as we start to see it get faster, as we start to see Amazon upgrade the hardware that runs their searches, there’s a good chance that semantic search is going to start dominating over traditional lexical search. And so I think that leads to a world where sellers, instead of trying to play these different keyword games, should just be describing their products as accurately and compellingly as possible, and I think that’s just a win for everybody.
Bradley Sutton:
Yeah, all right, guys. So, you know, if your head doesn’t hurt right now, then you probably weren’t paying attention, but that’s a lot to take in. You guys might need to re-listen to this. There’s a lot of exciting things happening on Amazon.
That’s one of the cool things about being in this industry. It’s not something that’s stale, or something that you can master and then never have to learn another thing ever again because you’re a master at it. No, you’ve got to keep studying and keep seeing what’s going on, and that’s what sets you guys apart.
You know, what separates the good Amazon sellers from the ones that might fall off is that those just make their listing and then set it and forget it, and never try and figure out how to optimize or remain at the top. And so, those of you who listen here to the end of this episode, you’re probably one of those ones who’s like, nah, I’ve got to make sure I’m at the top of the game and continue to develop, because Amazon is always on the cutting edge of different things. So, you know, thank you for joining us on here. Any last words of wisdom you can share with everybody out there, or things that we can expect from the cool AI lab, secret Avengers team here at Helium 10?
Kevin:
Yeah, I mean, you’re gonna see a lot of things come out. We’re gonna be releasing features within Helium 10, with Pacvue, that use AI. These are gonna be things that, like I said, become commonplace. I think the most important thing is that we’re paying attention to what’s happening in the industry. We’re aware of the trends, we’re aware of the new technologies, and we’re gonna make sure that our sellers have the best chance of success.
Bradley Sutton:
Awesome, awesome. Well, Kevin, thank you for joining us, and you know, I wouldn’t suggest reading a hundred more scientific papers. I think that’s too much. Your brain’s gonna explode soon. But we appreciate all the work that you’ve done so that we don’t have to go out and read all that stuff, and we’ll definitely invite you back next year, and it’ll be interesting to see where we’re at as far as the Amazon algorithm goes. So we’ll see you, yeah.
Kevin:
I mean, at this pace it’s gonna be completely different. So thanks for having me, Bradley.