How does search actually work?
To understand how you outsource search for your app, we must first understand what search is in the first place. Let’s take the example of Google Search, a small search engine you may have heard of.
Most search relies on a concept called indexing. If you’ve ever scoured a large scientific textbook, you may have skipped to the index at the back to find the keywords you needed before flipping to the associated pages. The same principle applies on the web. Google maintains an index (think: database) with relevant information on most web pages.
A sample book index
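To make the book-index analogy concrete, here is a minimal sketch of an inverted index, the data structure usually associated with this idea. The page IDs and text are made up for illustration, and real search indexes store far more than word-to-page mappings.

```python
from collections import defaultdict

# Hypothetical mini "web": page ID -> page text.
pages = {
    "page1": "red leather bag",
    "page2": "blue canvas bag",
    "page3": "red running shoes",
}

def build_index(docs):
    """Map each word to the set of page IDs that contain it."""
    index = defaultdict(set)
    for page_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(page_id)
    return index

def search(index, query):
    """Return the pages that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

index = build_index(pages)
print(sorted(search(index, "red bag")))  # ['page1']
```

Just like flipping to the back of a textbook, the lookup never rereads the pages themselves; it only consults the index.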
How is the index created?
Web Crawlers
Web crawlers (aka spiders, robots, agents) are programs that scan website links and learn what they’re all about. The crawlers start at a known list of websites before recursively traversing through page links. They pick up relevant webpage information, including content text, title, and descriptions, and add it to the index. Because HTML is so structured, these crawlers can learn most of what they need to know about a site programmatically. If you’ve ever heard of SEO, a big part of it is optimizing your site so that it’s easy for these crawlers to find what they need.
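The traversal described above can be sketched in a few lines. This is a toy version under loud assumptions: the "web" is a hypothetical in-memory dictionary rather than real HTTP requests, and it uses a queue (breadth-first) in place of literal recursion, which visits the same pages.

```python
from collections import deque

# Hypothetical link graph standing in for the web:
# URL -> (page title, outgoing links).
SITE = {
    "a.example": ("Home", ["b.example", "c.example"]),
    "b.example": ("Bags", ["c.example"]),
    "c.example": ("Shoes", ["a.example"]),
    "d.example": ("Orphan page", []),
}

def crawl(seeds):
    """Follow links outward from the seed URLs, recording each page's title."""
    seen = set(seeds)
    queue = deque(seeds)
    collected = {}
    while queue:
        url = queue.popleft()
        title, links = SITE.get(url, ("", []))
        collected[url] = title
        for link in links:
            if link not in seen:  # avoid revisiting pages (and infinite loops)
                seen.add(link)
                queue.append(link)
    return collected

crawled = crawl(["a.example"])
print(crawled)
```

Note that `d.example` never shows up in the output: nothing links to it, so a crawler starting from `a.example` has no way to discover it. That is one reason SEO cares so much about links.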
When it’s time to search, Google will apply a ranking algorithm on the index to serve the most relevant results. That algorithm – and the decades spent improving it – is arguably their “secret sauce” and why their results seem to be more relevant than other search engines.
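Google’s actual ranking algorithm is not public, so the sketch below is only a toy stand-in: it scores each document by how often the query words appear in it and sorts by that score. The documents and function names are hypothetical.

```python
def score(text, query):
    """Count how many times the query's words appear in the text."""
    words = text.lower().split()
    return sum(words.count(term) for term in query.lower().split())

def rank(docs, query):
    """Return IDs of matching documents, highest score first."""
    scored = [(score(text, query), doc_id) for doc_id, text in docs.items()]
    matches = [(s, doc_id) for s, doc_id in scored if s > 0]
    # Sort by descending score, breaking ties by document ID.
    matches.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in matches]

docs = {
    "p1": "bag bag bag sale",
    "p2": "leather bag",
    "p3": "running shoes",
}
ranked = rank(docs, "bag")
print(ranked)  # ['p1', 'p2']
```

Counting words is easy to game (just repeat "bag" a hundred times), which hints at why real ranking takes decades of refinement.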
So how does this relate to your app? With Google, the data that you’re searching through is all of the websites (and images, videos, etc.) in the world. But with your app, you are likely trying to build something that lets your users search through your app data. For Netflix that would be shows and movies; for Amazon it would be products.
So just as Google crawls the public web, Algolia indexes the data you provide to it. In both cases, similar complexities arise:
- How do you handle synonyms (e.g. user types ‘satchel’ instead of ‘bag’)?
- What do you do when there are no results?
- How do you handle searches written in different languages or stemming from specific geographies?
- How do you personalize results to certain user types?
- How do you glean context from less defined searches?
And of course, all of this has to be done fast. A typical query on Google comes back in a fraction of a second, and that lookup runs against an index covering a huge share of the public web!
Rather than having to scope, prototype, and handle each of these (and many more) bespoke conditions, Algolia makes it easy to address these out of the box.
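To get a feel for what handling even one of these conditions by hand involves, here is a minimal sketch of synonym expansion with an empty-result case. The synonym table and product list are hypothetical, and this is not how Algolia implements it.

```python
# Hypothetical synonym table; a real one would be much larger.
SYNONYMS = {"satchel": {"bag"}, "bag": {"satchel"}}

PRODUCTS = ["leather bag", "canvas tote", "running shoes"]

def expand(term):
    """Return the term plus any configured synonyms."""
    return {term} | SYNONYMS.get(term, set())

def search(query):
    """Match products containing any query word or one of its synonyms."""
    terms = set()
    for word in query.lower().split():
        terms |= expand(word)
    return [p for p in PRODUCTS if terms & set(p.split())]

print(search("satchel"))  # ['leather bag']
print(search("hat"))      # []
```

The second call returns nothing, which is exactly the "no results" question from the list: your code (or your search provider) has to decide what the user sees next.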
Algolia’s core product: search
Algolia’s bread-and-butter product is website search, which they break into three steps: configure, observe, and enhance. Let’s take a look at each.
Upload/Configure: The first step is uploading your data and configuring the search functionality. In order for Algolia to run their search, you first must give them the data you want users to search through. Most engineers opt to send their data to Algolia via API, but data can also be added manually or by uploading CSV/JSON files, which is the route I chose.
What's JSON? JSON (JavaScript Object Notation) is a lightweight, simple format for storing data. The data is composed of objects, which are made up of simple key-value pairs. Due to its readability and versatility, many APIs and web servers output data in JSON format.
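As a quick illustration of those key-value pairs, the sketch below parses a small JSON object with Python’s standard `json` module. The record is invented for this example and is not taken from the sample dataset.

```python
import json

# Hypothetical product record, invented for illustration.
raw = '{"name": "E-reader", "brand": "ExampleCo", "price": 99.99, "in_stock": true}'

record = json.loads(raw)  # parse JSON text into a Python dict
print(record["name"], record["price"])

# Serialize the dict back into nicely indented JSON text.
print(json.dumps(record, indent=2))
```

Each key (like `name`) maps to a value (like `"E-reader"`), and those keys are the properties a search engine can later be told to search on.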
Here’s an actual example of a JSON Kindle object from our sample dataset.
Teams can then configure the specific search settings through the UI, with most of those complex scenarios handled in just a few clicks. In this example, we set the properties that users can search on and added a synonym to be applied to user searches.
There’s no shortage of configuration options, including how results are sorted, how tolerant the search is to typos, and how to handle no results.
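Typo tolerance is a good example of something that sounds simple but takes real logic. One classic way to approach it is edit distance (Levenshtein distance), sketched below. This is a textbook technique used for illustration, not a description of Algolia’s implementation, and the `max_typos` parameter is a hypothetical name.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits to turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from a
                curr[j - 1] + 1,     # insert a character into a
                prev[j - 1] + cost,  # substitute (or keep) a character
            ))
        prev = curr
    return prev[-1]

def fuzzy_match(query, vocabulary, max_typos=1):
    """Return vocabulary words within max_typos edits of the query."""
    return [w for w in vocabulary if edit_distance(query, w) <= max_typos]

print(fuzzy_match("kimdle", ["kindle", "candle", "ipad"]))  # ['kindle']
```

Raising `max_typos` makes the search more forgiving but also noisier, which is the trade-off a "typo tolerance" setting exposes.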
Observe: Search is only as helpful as the value it provides for end users. Algolia makes it easy to track the ways users interact with your search via usage analytics.
In order for Algolia to properly track product usage and clicks, it needs some telemetry into the ways that users are interacting with your search. This requires event tracking.
There are a few ways to add events:
- InstantSearch: Algolia has built and maintains a JavaScript library which provides a prebuilt set of UI components (search bar, filter sidebar, item-list) that teams can use out of the box. If teams use these components to build their Search UI, event collection is as easy as setting a property to ‘true’.
- API Client: Teams can also use Algolia’s API client, to which they must manually send events when a user clicks on specific items. These can be sent in bulk or individually, and a sample ‘click’ API call might look like this:
This request specifies that it was a click event, sent to a specific index, by a specific user.
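For a rough idea of the shape of such a request body, here is a sketch that only builds the payload and prints it; it does not send anything. The field names are modeled on Algolia’s Insights events format as best understood here, so treat them as assumptions and check the current API reference. The index name, user token, and object ID are hypothetical.

```python
import json

def build_click_event(index_name, user_token, object_id, event_name="Product Clicked"):
    """Assemble a click-event body. Field names are assumptions to verify."""
    return {
        "events": [
            {
                "eventType": "click",       # what kind of event this is
                "eventName": event_name,    # a human-readable label
                "index": index_name,        # which index the click relates to
                "userToken": user_token,    # which (anonymous) user clicked
                "objectIDs": [object_id],   # which item(s) were clicked
            }
        ]
    }

payload = build_click_event("products", "user-123", "kindle-001")
print(json.dumps(payload, indent=2))
```

Because events live in a list, the same structure covers both the individual and bulk cases mentioned above.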
As events are triggered, Algolia logs and analyzes them to create helpful metrics like total searches, trends over time, and even user click properties (which item did users who searched for ‘iPad’ click on?).
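The ‘iPad’ question above boils down to grouping click events by search query and counting. A minimal sketch, assuming a hypothetical event log of (query, clicked item) pairs:

```python
from collections import Counter, defaultdict

# Hypothetical event log: (search query, clicked item ID).
events = [
    ("ipad", "ipad-air"),
    ("ipad", "ipad-air"),
    ("ipad", "ipad-case"),
    ("kindle", "kindle-paperwhite"),
]

def clicks_by_query(log):
    """Count, for each query, how often each item was clicked."""
    clicks = defaultdict(Counter)
    for query, item in log:
        clicks[query][item] += 1
    return clicks

stats = clicks_by_query(events)
print(stats["ipad"].most_common(1))                # [('ipad-air', 2)]
print(sum(sum(c.values()) for c in stats.values()))  # total clicks: 4
```

The same log could feed other metrics, such as totals per day for trends over time.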
Enhance: Enhance provides tools to improve the search experience based on usage and user patterns. While the initial search configuration is unlikely to be the most optimized, Algolia has added AI functionality to auto-improve the experience, including automatic creation of common synonyms, dynamic re-rankings of items, and user personalization strategies.
Example of AI Synonyms
Algolia's other products
While product search is Algolia’s main offering, they’ve expanded further into product recommendations and discovery. Users might understand what they’re initially looking for, but Recommend augments the experience through tailored suggestions.
There are a few common recommendation patterns Algolia offers:
- Complementary Recommendations (Products that are frequently bought together)
- Alternative Recommendations (Products that users switch between)
- Trends Recommendations (Products that have been searched more frequently)
The recommendations rely on supervised machine learning models that are trained on product data and user interactions. They analyze user events to create tables with interactions, from which they can pull patterns and showcase recommendations. You can read more here.
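To see how patterns can fall out of a table of interactions, here is a deliberately simple sketch of "frequently bought together". It swaps in plain co-occurrence counting for the machine learning models described above, so it illustrates the idea rather than Algolia’s method. The orders are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase history: each set is one order.
orders = [
    {"kindle", "kindle-case"},
    {"kindle", "kindle-case", "charger"},
    {"kindle", "kindle-case"},
    {"ipad", "ipad-case"},
]

def co_purchase_counts(baskets):
    """Count how often each pair of items appears in the same order."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return counts

def bought_with(item, counts, top_n=2):
    """Return the items most often purchased alongside the given item."""
    partners = Counter()
    for (a, b), n in counts.items():
        if a == item:
            partners[b] += n
        elif b == item:
            partners[a] += n
    return [name for name, _ in partners.most_common(top_n)]

counts = co_purchase_counts(orders)
print(bought_with("kindle", counts))  # ['kindle-case', 'charger']
```

A trained model can go well beyond raw counts, but the input is the same kind of interaction data.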
Who uses Algolia?
If you’ve never noticed a site using Algolia, that’s probably a good thing! Search is meant to just ‘work’ - you shouldn’t really notice it unless you hit a snag. Algolia is used broadly across startups and enterprises, including some well-known brands like Lacoste and Coursera.
Once a company hits a certain size and volume of offerings, it can make more sense to build the functionality in-house. With search being such a critical aspect of user experience, certain teams begin to lean toward owning the algorithms that power their rankings themselves (including most of those mentioned in our TL;DR). There’s a nice read on how DoorDash manages this here.