<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom"><title type="text">Blog posts by Junte's Blog</title><link href="http://world.optimizely.com" /><updated>2024-08-26T16:13:51.0000000Z</updated><id>https://world.optimizely.com/blogs/juntes-blog/</id> <generator uri="http://world.optimizely.com" version="2.0">Optimizely World</generator> <entry><title>How to Create AI-driven Chatbots with Optimizely Graph</title><link href="https://world.optimizely.com/blogs/juntes-blog/dates/2024/8/how-to-create-ai-driven-chatbots-with-optimizely-graph/" /><id>&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;In the 1950s, Alan Turing hypothesized that a computer program could converse with humans. In 1966, the first chatbot, ELIZA, was created at MIT; it &lt;a href=&quot;https://web.njit.edu/~ronkowit/eliza.html&quot;&gt;acts like a therapist&lt;/a&gt;. Since then, chatbots have evolved from rules-based natural language processing to Conversational AI, including the use of generative AI (GenAI) with the ChatGPT boom.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&quot;&lt;em&gt;Hi Siri, can you find me a pizzeria?&lt;/em&gt;&quot; &amp;ndash; this is an example of Conversational AI in action. It has become a buzzword with the advent of ever more powerful computing to process huge amounts of data, breakthroughs in Artificial Intelligence (AI), and the adoption of mobile devices offering speech input with Apple&#39;s Siri, Amazon&#39;s Alexa, IBM&#39;s Watson Assistant or Google Home. Conversational AI can improve our access to information by making it easier and faster to interact with online systems, either by text or speech. For example, IBM&amp;rsquo;s Watson question-answering system beat human champions at the TV game show Jeopardy! in 2011.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Not surprisingly, when you have an online platform with a lot of curated content, using AI-driven chatbots to improve access and get more of your content consumed becomes a tempting option. In this blog post, I will explain the basics and present a start towards setting up a chatbot for your platform and your content with &lt;strong&gt;Optimizely Graph&lt;/strong&gt;, with as little code as possible.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style=&quot;font-size: 14pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Conversational AI&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Conversational AI allows easy access to various services and information via conversation and is fundamental to natural user interfaces. These user interfaces can be &lt;a href=&quot;https://www.uxmatters.com/mt/archives/2021/01/why-mobile-apps-need-voice-interfaces.php&quot;&gt;text based, but also voice based&lt;/a&gt;&amp;nbsp;with the&amp;nbsp;&lt;a href=&quot;https://www.statista.com/statistics/277125/share-of-website-traffic-coming-from-mobile-devices/&quot;&gt;increasing use&lt;/a&gt; of mobile devices. Publicly available conversational systems go back many decades; the precursors of today&#39;s chatbot systems relied heavily on hand-crafted rules and are very different from the data-driven conversational AI systems of today. Breakthroughs in deep learning (DL) and reinforcement learning (RL) are now applied to conversational AI, with generative AI driven by so-called Large Language Models (LLMs).&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Conversational AI provides concise, direct answers to user queries based on rich knowledge drawn from various data sources, including text collections such as Web documents and pre-compiled knowledge bases such as sales and marketing datasets. Search engine companies, including Google, Microsoft and Baidu, have incorporated QA capabilities into their search engines to make the user experience more conversational, which is particularly appealing on mobile devices. Instead of returning ten blue links, the search engine generates a direct answer to a user query. This is particularly useful for informational queries, where the user&#39;s intent is to look for information and get an answer to a question quickly.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;For websites that aim for greater conversion and stickiness, we often see a chatbot on the landing page. For example, &lt;a href=&quot;https://www.salesforce.com/&quot;&gt;Salesforce&lt;/a&gt; has one, and so does &lt;a href=&quot;https://www.optimizely.com/get-started/&quot;&gt;Optimizely&lt;/a&gt;. That does not mean they are always driven by Conversational AI. Chatbots are also used within other channels such as WhatsApp, or perhaps even your car, where users can interact with your content without having loaded or even seen your site.&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style=&quot;font-size: 14pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Retrieval Augmented Generation (RAG)&lt;/span&gt;&lt;/h2&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Retrieval-Augmented Generation (RAG), &lt;a href=&quot;https://arxiv.org/pdf/2005.11401&quot;&gt;coined in 2020&lt;/a&gt; just before the GenAI boom, is an approach to generative AI that combines the strengths of traditional information retrieval systems (search engines) with the capabilities of generative large language models (LLMs). This &lt;a href=&quot;https://blogs.nvidia.com/blog/what-is-retrieval-augmented-generation/&quot;&gt;blog post&lt;/a&gt; has more background information and history about RAG. By combining this extra knowledge with its own language skills, the AI can write text that is more accurate, up-to-date, and relevant to your specific needs. It reduces the problem of incorrect or misleading information (&quot;hallucination&quot;), and it offers transparency: the model has access to the most current, reliable facts, and users have access to the model&amp;rsquo;s sources, so its claims can be checked for accuracy and ultimately trusted.&amp;nbsp;RAG also reduces the need to continuously retrain the model on new data and update its parameters as circumstances evolve, which can lower the computational and financial costs of running LLM-powered chatbots.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;RAG consists of two steps:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Retrieval: collect the relevant information given a query from a system with your data. This is where Optimizely Graph can be a solution.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Generation: The pre-processed retrieved information is then seamlessly incorporated into the pre-trained LLM. This integration enhances the LLM&#39;s context, providing it with a more comprehensive understanding of the topic. This augmented context enables the LLM to generate more precise, informative, and engaging responses.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
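&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;The two steps can be sketched with a toy example. Here a small in-memory list stands in for Optimizely Graph, retrieval is naive term overlap, and the &quot;generation&quot; step only assembles the augmented prompt that would be sent to an LLM; all names are illustrative.&lt;/span&gt;&lt;/p&gt;

```python
import re

# Toy corpus standing in for content indexed in a retrieval system.
DOCS = [
    "Titanic (1997) was directed by James Cameron.",
    "Avatar (2009) was directed by James Cameron.",
    "Alien (1979) was directed by Ridley Scott.",
]

def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, docs, k=2):
    # Step 1 (retrieval): rank documents by term overlap with the query.
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]

def augment(query, context):
    # Step 2 (generation): a real system sends this augmented prompt to an
    # LLM; here we only build it to show how retrieved context is injected.
    return f"Answer '{query}' using only this context:\n" + "\n".join(context)

prompt = augment("Who directed Avatar?", retrieve("Who directed Avatar?", DOCS))
```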
&lt;h2&gt;&lt;span style=&quot;font-size: 14pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Putting it to Action with Optimizely Graph&lt;/span&gt;&lt;/h2&gt;
&lt;h3&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Overview&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;In my previous blog post, &quot;&lt;a href=&quot;/link/847c0fb0d6f54a8e836b3fd0b9dd28b3.aspx&quot;&gt;Do you know what I mean? Introducing Semantic Search in Optimizely Graph&lt;/a&gt;&quot;, I introduced the semantic search feature in Graph. It helps return more relevant results at the top by better &quot;understanding&quot; the query, so the answer generated by the LLM will be more accurate and complete for the user. The RAG flow with Optimizely Graph is depicted here:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;img src=&quot;/link/e9b3c38b7a2f401aa1876aba605b02b8.aspx&quot; width=&quot;863&quot; alt=&quot;RAG with Optimizely Graph&quot; height=&quot;703&quot; /&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Besides Optimizely Graph, we can use a cloud service that hosts LLMs behind APIs. In my example, I have used &lt;a href=&quot;https://groq.com/&quot;&gt;Groq&lt;/a&gt;, since it has a free trial and public libraries in C# and Python with good documentation, so it&#39;s easy to try first. Other well-known services are OpenAI or its Azure variant, Amazon Bedrock, Google&#39;s Vertex AI, etc. You can find the full, open-source code for my implementation in &lt;a href=&quot;https://github.com/episerver/optimizely-graph-python-sdk&quot;&gt;this GitHub repository&lt;/a&gt;.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;h3&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Setup&lt;/span&gt;&lt;/h3&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We can setup &lt;a href=&quot;https://docs.developers.optimizely.com/platform-optimizely/v1.4.0-optimizely-graph/docs/synchronize-content-data#authorization&quot;&gt;Optimizely Graph by using basic authentication&lt;/a&gt;.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We can create the GraphQL schema types with &lt;a href=&quot;https://docs.developers.optimizely.com/platform-optimizely/v1.4.0-optimizely-graph/docs/synchronize-content-types&quot;&gt;an API as described here&lt;/a&gt;, and you will find an example for this demo here. Also see my tutorial with &lt;a href=&quot;https://docs.developers.optimizely.com/platform-optimizely/v1.4.0-optimizely-graph/docs/understanding-your-data&quot;&gt;another example with IMDb data&lt;/a&gt; that illustrates aggregating information from &lt;a href=&quot;https://docs.developers.optimizely.com/platform-optimizely/v1.4.0-optimizely-graph/docs/synchronize-content-types&quot;&gt;multiple sources in Graph&lt;/a&gt; using &lt;a href=&quot;https://docs.developers.optimizely.com/platform-optimizely/v1.4.0-optimizely-graph/docs/joins-with-linking&quot;&gt;joins&lt;/a&gt;, and the use of &lt;a href=&quot;https://docs.developers.optimizely.com/platform-optimizely/v1.4.0-optimizely-graph/docs/semantic-search&quot;&gt;semantic search&lt;/a&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We use the GQL Python package to interface with Optimizely Graph, and use the &lt;a href=&quot;https://docs.developers.optimizely.com/platform-optimizely/v1.4.0-optimizely-graph/reference/post-graphqlv2handler&quot;&gt;Content V2 endpoint&lt;/a&gt; to send GraphQL queries to Graph.&amp;nbsp;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We use the Groq Python package to interface with the Groq APIs. In this blog post, we will use &lt;a href=&quot;https://console.groq.com/docs/api-reference#chat-create&quot;&gt;Groq&#39;s completion API.&lt;/a&gt;&amp;nbsp;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We use the &lt;a href=&quot;https://www.nltk.org/&quot;&gt;NLTK&lt;/a&gt; package, which is used for natural language processing tasks, to do some preprocessing of the search text sent to Graph, so we can get more accurate results ranked to the top.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
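&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;As a minimal sketch of the basic authentication step: HTTP Basic auth base64-encodes a &quot;key:secret&quot; pair into the Authorization header. The exact credential names come from your Graph configuration (see the linked docs); the parameter names and values below are placeholders.&lt;/span&gt;&lt;/p&gt;

```python
import base64

def basic_auth_header(app_key: str, secret: str) -> dict:
    # HTTP Basic auth: base64-encode "key:secret". The credential names
    # here are illustrative; use the values from your Graph account.
    token = base64.b64encode(f"{app_key}:{secret}".encode("utf-8")).decode("ascii")
    return {"Authorization": "Basic " + token}

headers = basic_auth_header("my-app-key", "my-secret")
```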
&lt;h3&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Sync Data to Graph&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We can sync data to Graph by modeling the schema with content types first. The content types consist of the different types of content, the properties (field name and type), and the relationships between properties. We model based on a small &lt;a href=&quot;https://github.com/vatsalsaglani/GraphRAG4Rec/blob/main/src/imdb/data/imdb_top_100.json&quot;&gt;IMDb dataset created here&lt;/a&gt;. This is how we can define the data in Optimizely Graph, which consists of the metadata of popular movies extracted from IMDb:&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;language-javascript&quot;&gt;&lt;code&gt;{
  &quot;label&quot;: &quot;IMDB2&quot;,
  &quot;languages&quot;: [
    &quot;en&quot;
  ],
  &quot;contentTypes&quot;: {
    &quot;Record&quot;: {
      &quot;abstract&quot;: true,
      &quot;contentType&quot;: [],
      &quot;properties&quot;: {
        &quot;ContentType&quot;: {
          &quot;type&quot;: &quot;[String]&quot;
        }
      }
    },
    &quot;Movie&quot;: {
      &quot;contentType&quot;: [
        &quot;Record&quot;
      ],
      &quot;properties&quot;: {
        &quot;id&quot;: {
          &quot;type&quot;: &quot;String&quot;
        },
        &quot;ContentType&quot;: {
          &quot;type&quot;: &quot;[String]&quot;
        },
        &quot;genre&quot;: {
          &quot;type&quot;: &quot;[String]&quot;,
          &quot;searchable&quot;: true
        },
        &quot;year&quot;: {
          &quot;type&quot;: &quot;String&quot;
        },
        &quot;title&quot;: {
          &quot;type&quot;: &quot;String&quot;,
          &quot;searchable&quot;: true
        },
        &quot;certificate&quot;: {
          &quot;type&quot;: &quot;String&quot;
        },
        &quot;overview&quot;: {
          &quot;type&quot;: &quot;String&quot;,
          &quot;searchable&quot;: true
        },
        &quot;director&quot;: {
          &quot;type&quot;: &quot;String&quot;,
          &quot;searchable&quot;: true
        },
        &quot;cast&quot;: {
          &quot;type&quot;: &quot;[String]&quot;,
          &quot;searchable&quot;: true
        }
      }
    }
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Then we can proceed by massaging the data to the jsonl format that allows it to be synced to Graph, and then sync it. This is how we could do it:&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;language-python&quot;&gt;&lt;code&gt;SOURCE = &quot;imdb2&quot;
HOST = os.getenv(&#39;HOST&#39;, &quot;https://cg.optimizely.com&quot;)
SCHEMA_SYNC_ENDPOINT = &quot;{}/api/content/v3/types?id={}&quot;.format(HOST, SOURCE)
DATA_SYNC_ENDPOINT = &quot;{}/api/content/v2/data?id={}&quot;.format(HOST, SOURCE)
AUTH_TOKEN = os.getenv(&#39;AUTH_TOKEN&#39;, &quot;&amp;lt;TOKEN&amp;gt;&quot;)
HEADERS = {
    &#39;Content-Type&#39;: &#39;text/plain&#39;,
    &#39;Authorization&#39;: &#39;Basic &#39; + AUTH_TOKEN
}

SCHEMA_FILE = &quot;models/content_types.json&quot;
MOVIE_FILE = &quot;data/imdb_top_100.json&quot;


# reset data
def reset_data():
    response = requests.request(&quot;DELETE&quot;, DATA_SYNC_ENDPOINT + &quot;&amp;amp;languages=en&quot;, headers=HEADERS)


# load schema
def load_schemas():
    with open(SCHEMA_FILE) as f:
        schema = json.dumps(json.load(f))
        response = requests.request(&quot;PUT&quot;, SCHEMA_SYNC_ENDPOINT, headers=HEADERS, data=schema)


# load the data
def load_data(source, content_type, language):
    with open(source) as f:
        contents = json.load(f)
        bulk = &quot;&quot;
        for i, content in enumerate(contents):
            content[&quot;ContentType&quot;] = [&quot;Record&quot;, content_type]
            content[&quot;Status&quot;] = &quot;Published&quot;
            content[&quot;_rbac&quot;] = &quot;r:Everyone:Read&quot;
            content[&quot;__typename&quot;] = content_type
            content[&quot;genre___searchable&quot;] = content.pop(&quot;genre&quot;)
            content[&quot;title___searchable&quot;] = content.pop(&quot;title&quot;)
            content[&quot;overview___searchable&quot;] = content.pop(&quot;overview&quot;)
            content[&quot;director___searchable&quot;] = content.pop(&quot;director&quot;)
            content[&quot;cast___searchable&quot;] = content.pop(&quot;cast&quot;)
            content.pop(&#39;llm_text&#39;, None)
            bulk += &quot;{\&quot;index\&quot;: { \&quot;_id\&quot;: \&quot;&quot; + source + str(i) + &quot;\&quot;, \&quot;language_routing\&quot;: \&quot;&quot; + language + &quot;\&quot; }}\n&quot; + simplejson.dumps(content, ignore_nan=True)
            if i != len(contents)-1:
                bulk += &quot;\n&quot;
        response = requests.request(&quot;POST&quot;, DATA_SYNC_ENDPOINT, headers=HEADERS, data=bulk)


reset_data()
load_schemas()
load_data(MOVIE_FILE, &quot;Movie&quot;, &quot;en&quot;)&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Rewrite the Question by Augmentation&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We apply co-reference resolution to questions where there are ambiguous pronouns in a question that are used in a follow-up question. We rewrite the question by augmentation, so a more concrete question can be asked to Optimizely Graph and the LLM can give us the right answer. We can use the LLM to do this (as &lt;a href=&quot;https://thenewstack.io/improving-chatgpts-ability-to-understand-ambiguous-prompts/&quot;&gt;this blog post also explains&lt;/a&gt;), so we are using the LLM to rewrite the questions, and then use it again to get answers. This is a concept called &lt;em&gt;prompt engineering&lt;/em&gt;, where we instruct the LLM what to do, and feed it with examples, so it learns.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Imagine this conversation:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;strong&gt;Q:&lt;/strong&gt; &lt;em&gt;Who is James Cameron?&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;strong&gt;A:&lt;/strong&gt; James Cameron is a director of movies.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;strong&gt;Q:&lt;/strong&gt; &lt;em&gt;What movies did he direct?&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;strong&gt;A:&lt;/strong&gt; James Cameron directed the following movies: 1. Titanic (1997) 2. Avatar (2009) 3. Terminator 2: Judgment Day (1991) 4. Aliens (1986) 5. The Terminator (1984)&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Without co-reference resolution, &quot;he&quot; cannot be resolved and Optimizely Graph cannot return the right information. So &quot;&lt;em&gt;What movies did he direct?&quot; &lt;/em&gt;will be augmented as &quot;&lt;em&gt;What movies did James Cameron direct?&lt;/em&gt;&quot;. This is how it could be implemented:&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;language-python&quot;&gt;&lt;code&gt;def rewrite_question(question, previous):
    post_prompt = &quot;Don&#39;t explain your answers. Don&#39;t give information not mentioned in the CONTEXT INFORMATION.&quot;
    chats = []
    if previous:
        chats.insert(0, {&quot;role&quot;: &quot;system&quot;, &quot;content&quot;: &quot;You only rewrite the question. Do not change the question type. Do not explain. You do not know anything about movies or actors. Return one Question with only coreference resolution based on the Context. Example, replace \&quot;he\&quot; with a name. If you cannot apply coreference resolution, then just return the original Question without any changes.&quot;})
        chats.append({&quot;role&quot;: &quot;user&quot;, &quot;content&quot;: f&quot;Context: {previous}\n\nQuestion: {question}\n&quot; + post_prompt})
        rewritten_question = get_chat_completion(chats=chats)
        return rewritten_question
    return question
&lt;/code&gt;&lt;/pre&gt;
&lt;h3&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Query Optimizely Graph with Semantic Search&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Now that we have synced the data to Optimizely Graph, it is time to query. We can define a GraphQL query template, where we match and rank on the searchable fields using semantic search, and additionally apply weights to some fields. Graph returns a dictionary of movies with properties, which the LLM can use to generate a meaningful answer. This is what it could look like:&lt;/span&gt;&lt;/p&gt;
&lt;pre class=&quot;language-javascript&quot;&gt;&lt;code&gt;{
  Movie(
    locale: en
    where: {
      _or: [
        { cast: { match: &quot;sigourney weaver?&quot;, boost: 2 } }
        { director: { match: &quot;sigourney weaver?&quot;, boost: 3 } }
        { title: { match: &quot;sigourney weaver?&quot;, boost: 10 } }
        { _fulltext: { match: &quot;sigourney weaver?&quot;, fuzzy: true } }
      ]
    }
    orderBy: { _ranking: SEMANTIC }
    limit: 10
  ) {
    items {
      cast
      director
      overview
      title
      genre
      year
    }
  }
}&lt;/code&gt;&lt;/pre&gt;
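&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;To issue such a query for an arbitrary search phrase, the template can be parameterized; a minimal sketch (the field names and boosts mirror the example query, and json.dumps safely escapes any quotes in the phrase):&lt;/span&gt;&lt;/p&gt;

```python
import json

# GraphQL query template with the same searchable fields and boosts as the
# example query; braces are doubled so str.format leaves them literal.
QUERY_TEMPLATE = """{{
  Movie(
    locale: en
    where: {{
      _or: [
        {{ cast: {{ match: {phrase}, boost: 2 }} }}
        {{ director: {{ match: {phrase}, boost: 3 }} }}
        {{ title: {{ match: {phrase}, boost: 10 }} }}
        {{ _fulltext: {{ match: {phrase}, fuzzy: true }} }}
      ]
    }}
    orderBy: {{ _ranking: SEMANTIC }}
    limit: {limit}
  ) {{ items {{ cast director overview title genre year }} }}
}}"""

def build_query(phrase: str, limit: int = 10) -> str:
    # json.dumps wraps the phrase in double quotes and escapes embedded ones.
    return QUERY_TEMPLATE.format(phrase=json.dumps(phrase), limit=limit)

query = build_query("sigourney weaver")
```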
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Stopwords (common words like &quot;is&quot;, &quot;the&quot;, &quot;a&quot;, etc.) can introduce noise in the semantic search query and push undesired results to the top. Because we have a very small dataset with little content and an unusual distribution of words, we improve the queries with some simple pre-processing, using the Natural Language Toolkit package to remove stopwords. For larger datasets, this step may not be needed. The retrieval stage is very important in RAG, as it determines the quality of the answers and the likelihood of returning &quot;I don&#39;t know&quot;. We will continuously improve the matching and ranking in Optimizely Graph, and allow you to introduce more customization as well, as I originally posited in my &lt;a href=&quot;/link/24e13898fe1b4ca0a1aa0e0a33693b81.aspx&quot;&gt;first blog post&lt;/a&gt; &lt;em&gt;&quot;Why Optimizely Graph is Search as a Service&quot;&lt;/em&gt;.&lt;/span&gt;&lt;/p&gt;
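&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;The post uses NLTK&#39;s English stopword list (nltk.corpus.stopwords, available after downloading the corpus); to keep this sketch self-contained, a tiny hand-rolled stopword set stands in for it here:&lt;/span&gt;&lt;/p&gt;

```python
# Illustrative stopword stripping. In the real pipeline this set would be
# nltk.corpus.stopwords.words("english"); the set below is a small stand-in.
STOPWORDS = {"a", "an", "and", "the", "is", "are", "of", "in", "to", "for",
             "on", "with", "what", "who", "did", "can", "you", "me"}

def strip_stopwords(text: str) -> str:
    # Lowercase, split on whitespace, and drop any stopword tokens.
    kept = [w for w in text.lower().split() if w not in STOPWORDS]
    return " ".join(kept)

cleaned = strip_stopwords("What movies did James Cameron direct")
```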
&lt;h3&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Use an LLM to Generate an Answer From Graph&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Now that we have results from Graph, we can feed them to Groq. We store the chat history and reuse it as context for the LLM; this improves the answers, since more context is used and the chat session is preserved. However, the context has a size limit. In my example, I have limited the chat history to the last 5 turns. Note that my example works for a single user in a single session; with multiple users and multiple sessions, the chat history needs to be stored and retrieved per user as well.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
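&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;A minimal sketch of such a bounded history: collections.deque with maxlen discards the oldest turns automatically once the cap of 5 is reached (the turn structure here is illustrative, not the exact one from the repository).&lt;/span&gt;&lt;/p&gt;

```python
from collections import deque

# Cap the chat history so the prompt context stays within the model's
# limit; deque(maxlen=5) drops the oldest turn on each overflow.
history = deque(maxlen=5)

def remember(question, answer):
    # One (question, answer) tuple per turn keeps this sketch short.
    history.append((question, answer))

for i in range(8):
    remember(f"q{i}", f"a{i}")
```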
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We use Meta&#39;s &lt;a href=&quot;https://ai.meta.com/blog/meta-llama-3/&quot;&gt;llama3-8b-8192&lt;/a&gt; model and the chat completions API. To make sure we only get answers based on the context, we give instructions to the LLM. This is what we instruct our model to do:&lt;/span&gt;&lt;/p&gt;
&lt;div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style=&quot;font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;em&gt;&lt;span style=&quot;font-size: 12pt;&quot;&gt;You are a movie expert. You return answers as full sentences. If there is an enumeration, return a list with numbers. If the question can&#39;t be answered based on the context, say &quot;I don&#39;t know&quot;. Do not return an empty answer. Do not start answer with: &quot;based on the context&quot;. Do not refer to the &quot;given context&quot;. Do not refer to &quot;the dictionary&quot;. You learn and try to answer from contexts related to previous question.&lt;/span&gt;&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We wrap this in a Flask app, so we can try it out in the browser with a simple UI. This is an example conversation with our RAG chatbot. It returns answers based only on the results from Optimizely Graph, although increasing the &quot;temperature&quot; parameter makes the model more likely to deviate from the context and draw on its own pre-trained knowledge. The format and style of answers can be configured in the prompts. Want to do more and improve? Test manually, or evaluate with, for example, &lt;a href=&quot;https://docs.ragas.io/en/stable/&quot;&gt;RAGAS&lt;/a&gt;; only this way can you find the sweet spot for the best results.&lt;/span&gt;&lt;/div&gt;
&lt;h3&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Some Results&lt;/span&gt;&lt;/h3&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;We have built a simple Flask app to try out the chatbot. Here you see an example where we ask direct and unambiguous questions.&lt;/span&gt;&lt;/p&gt;
&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;img src=&quot;/link/981d7e5d6b124ad9b937fd8c5711985e.aspx&quot; width=&quot;620&quot; alt=&quot;RAG with Optimizely Graph&quot; height=&quot;943&quot; style=&quot;border-style: none;&quot; /&gt;&lt;/span&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;And what about follow-up questions, putting co-reference resolution into practice? A question like &quot;&lt;em&gt;can you give me a summary of it?&lt;/em&gt;&quot; will be rewritten as &quot;&lt;em&gt;can you give me a summary of The Avatar&lt;/em&gt;&quot;. Here you see a longer example conversation that shows the importance of augmenting questions.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;img src=&quot;/link/4525b621c35e41c4a21f002ccb47f9fc.aspx&quot; width=&quot;620&quot; alt=&quot;Optimizely Graph RAG co-reference resolution 1&quot; height=&quot;655&quot; style=&quot;border-style: none;&quot; /&gt;&lt;img src=&quot;/link/4503d1d7bfa9442c99a48186d12694a4.aspx&quot; width=&quot;620&quot; alt=&quot;Optimizely Graph RAG co-reference resolution 2&quot; height=&quot;629&quot; style=&quot;border-style: none;&quot; /&gt;&lt;/span&gt;&lt;/p&gt;
&lt;h2&gt;&lt;span style=&quot;font-size: 14pt; font-family: verdana, geneva, sans-serif;&quot;&gt;Wrap Up&lt;/span&gt;&lt;/h2&gt;
&lt;/div&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;I have explained some history and background of LLM-powered chatbots using an approach called Retrieval Augmented Generation (RAG). With an example implementation, I have showcased how we can use Optimizely Graph to implement RAG on your content. Adding a chatbot next to the search box on your site &amp;mdash; all driven by Optimizely Graph &amp;mdash; is possible. You can use this as a reference and a starter, extending and improving it into a production-ready chatbot on your site that uses Optimizely Graph. The code for this example implementation is open source, and you can &lt;strong&gt;&lt;a href=&quot;https://github.com/episerver/optimizely-graph-python-sdk/tree/main&quot;&gt;find it here&lt;/a&gt;&lt;/strong&gt;.&amp;nbsp;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style=&quot;font-size: 12pt; font-family: verdana, geneva, sans-serif;&quot;&gt;&lt;em&gt;Optimizely Graph allows you to aggregate on your CMS content and other data with GraphQL, including advanced search functionality driven by AI, so you can deliver it to any channel, app or device. It is part of &lt;a href=&quot;https://www.optimizely.com/company/press/optimizely-one/&quot;&gt;Optimizely One&lt;/a&gt;. It is included in the Optimizely PaaS offering as well as the CMS SaaS.&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;</id><updated>2024-08-26T16:13:51.0000000Z</updated><summary type="html">Blog post</summary></entry> <entry><title>Do you know what I mean? Introducing Semantic Search in Optimizely Graph</title><link href="https://world.optimizely.com/blogs/juntes-blog/dates/2023/10/do-you-know-what-i-mean-introducing-semantic-search-in-optimizely-graph/" /><id>&lt;p&gt;It seems not long ago that search engines began to allow users to search beyond strings and match on words. Understanding the basic structure of a natural language was good enough for search engines to move beyond string matching (as done in relational databases) and match on so-called tokens. Often, these tokens are words. A word is a basic element of a language that has meaning. However, search engines were not intelligent enough to understand the meaning of these words. Instead of matching on one very long string, they match on tokens. Basically, all tokens were treated as a bag of words, where we did not assume any relation between these words. A na&amp;iuml;ve but very effective approach in &lt;em&gt;information retrieval&lt;/em&gt; (IR, or the science behind search).&lt;/p&gt;
&lt;p&gt;But with the advent of deep learning and other AI advancements, we are seeing new developments in search engines as well, in the area of semantic search. In this post, I will try to explain the history behind the technology, the origin of the most important terminology used in this domain, what this technology is, and how we support it in Optimizely Graph, or &lt;em&gt;Graph&lt;/em&gt; for short.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;How a full-text search engine works in a nutshell&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Full-text search engines store text and return results efficiently. Key features are advanced matching on these texts, blazing retrieval speed, and intelligent scoring of results, so that users get the most relevant information as fast as possible.&lt;/p&gt;
&lt;p&gt;Data is stored by creating tokens. Tokens created for indexing are commonly stored in full-text search engines in a special data structure called the inverted index &amp;ndash; basically a special table optimized for quick lookups. The index is called inverted because the tokens (and not the documents) are stored in a column store with extra dimensions, like the reference to the documents containing the token, its frequency in a document, the positions of a token in a stream of tokens, etc. Here we see a low-level visualization of this in&lt;a href=&quot;https://lucene.apache.org/&quot;&gt; Lucene&lt;/a&gt;, an open-source core search library that drives a lot of the site search on the web. Content and its metadata are indexed into special files, and together these are called an index segment. An index consists of these segments.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;/link/7987f71439044ff0a1d652e6d4900e7a.aspx&quot; width=&quot;688&quot; alt=&quot;Datastructure used in Lucene&quot; height=&quot;215&quot; /&gt;&lt;/p&gt;
&lt;p&gt;Crucially, these dimensions are used to represent a token in a vector space. We speak then of term-based vectors, or sparse embeddings: a dimension can be as simple as a token being present or not, and since most documents contain only a small subset of the full vocabulary, zero values are plentiful, which makes these vectors sparse. Note that uninverted stores (so-called doc values) are also used, as &lt;a href=&quot;https://blog.parse.ly/lucene/&quot;&gt;explained in this blog post&lt;/a&gt;, for example to efficiently create facets and drive analytics.&lt;/p&gt;
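To make the inverted index concrete, here is a minimal Python sketch. It is illustrative only and far simpler than Lucene's on-disk structures: it maps each token to a postings list recording which documents contain it and at which positions.

```python
from collections import defaultdict

def tokenize(text):
    # Naive lowercase/whitespace tokenizer; real engines use full analyzers.
    return text.lower().split()

def build_inverted_index(docs):
    # Maps token -> postings: {doc_id: [positions of the token in that doc]}.
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, token in enumerate(tokenize(text)):
            index[token].setdefault(doc_id, []).append(pos)
    return index

docs = {
    1: "the old west was wild",
    2: "a wild western movie",
}
index = build_inverted_index(docs)
print(index["wild"])  # lookup: which documents contain "wild", and where
```

A lookup is then a dictionary access on the token rather than a scan over every document, which is exactly why the structure is fast.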
&lt;p&gt;These sparse vector embeddings are used to match and, most importantly, rank the results. In 1975, &lt;a href=&quot;https://dl.acm.org/doi/10.1145/361219.361220&quot;&gt;G. Salton et al.&lt;/a&gt; came up with the vector space model for indexing. This was a purely algebraic approach, but it was extended with a probabilistic relevance model in the 1970s and 1980s by Stephen E. Robertson, Karen Sp&amp;auml;rck Jones, and others, and the result was eventually &lt;a href=&quot;https://trec.nist.gov/pubs/trec3/papers/city.ps.gz&quot;&gt;presented in 1994 at TREC&lt;/a&gt; (a conference where information retrieval researchers work on text retrieval methodologies) as Okapi BM25, or BM25 for short.&lt;/p&gt;
&lt;p&gt;The basic idea of BM25 is that the search engine considers the frequency of a word in a document, the number of documents containing that word (so that terms occurring in many documents are weighted lower), document length normalization, and term-frequency saturation to mitigate the impact of very frequent words. It remains the baseline of relevance ranking for full-text search and the default ranking in solutions like Elasticsearch and Apache Solr. Not surprisingly, Optimizely Graph also uses BM25 as its default relevance ranking model. &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.1.0-search-and-navigation/docs&quot;&gt;Search &amp;amp; Navigation&lt;/a&gt;, though, still uses the vector space model from G. Salton et al.&lt;/p&gt;
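The scoring idea can be sketched in a few lines of Python. This is a simplified illustration of the BM25 ingredients listed above (implementations differ in details such as IDF smoothing), not the exact formula used by any particular engine:

```python
import math

def bm25_score(query_terms, doc_tokens, corpus, k1=1.2, b=0.75):
    # corpus: list of tokenized documents; doc_tokens: the document being scored.
    N = len(corpus)
    avgdl = sum(len(d) for d in corpus) / N  # average document length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)         # document frequency
        idf = math.log(1 + (N - df + 0.5) / (df + 0.5))  # rarer terms weigh more
        tf = doc_tokens.count(term)                      # term frequency in this doc
        # Saturation: tf's contribution flattens as tf grows (k1),
        # while b normalizes for document length against the average.
        score += idf * tf * (k1 + 1) / (tf + k1 * (1 - b + b * len(doc_tokens) / avgdl))
    return score

corpus = [["wild", "west", "movie"], ["cooking", "pasta"], ["wild", "flowers"]]
print(bm25_score(["wild", "movie"], corpus[0], corpus))
```

A document containing the query terms scores positively; one containing none of them scores zero, since each missing term contributes nothing.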
&lt;p&gt;That is not to say that BM25 is the best model out there. A plethora of research papers have shown that a pure probabilistic model with &lt;a href=&quot;https://www.elastic.co/blog/language-models-in-elasticsearch&quot;&gt;language models&lt;/a&gt; can outperform BM25. And ranking results is much more sophisticated than just statistics and probability theory on the tokens that you have stored. Search has been significantly improved by incorporating many other signals in ranking models. Can anyone still remember &lt;a href=&quot;https://en.wikipedia.org/wiki/PageRank&quot;&gt;PageRank&lt;/a&gt;? Google made web search a commodity through the highly successful application of this idea, where the number of incoming links to a page is used as a boost, so frequently linked pages rank higher. Google has said that more than 200 signals are used for its search. Indeed, using signals like these is part of artificial intelligence, which helps get the most relevant content onto the screen of the user.&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;AI comes knocking at the door&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;We thought that web search was a solved problem for most users. Users seemed satisfied with what they got from Google. However, the boom in generative AI, made possible by huge improvements in deep learning, has unlocked new ways to interact with information and new insights to gain. Instead of scrolling through a list of text, chatbots can directly deliver the answer that you were looking for. Talking to a device and getting a satisfying verbal response from it is no longer science fiction.&lt;/p&gt;
&lt;p&gt;Key to generative AI technology is the concept of transforming text into numbers. We need this because machine learning models understand numbers, but not words. We can train on a lot of text and build so-called &lt;em&gt;Large Language Models&lt;/em&gt; (LLMs), which consist of numerical representations. To understand this core technology, I can recommend the &lt;a href=&quot;https://ig.ft.com/generative-ai/&quot;&gt;visual explanation on transformers here&lt;/a&gt;. A word can be associated with multiple dimensions based on linguistic features, which are represented as vectors; combined, they are called an embedding. They are coordinates in the vector space. The number of dimensions is limited because only useful ones are selected and used. That is why these numbers are also called dense vector embeddings. This is different from the sparse vector embeddings we calculate from the tokens stored in an inverted index.&lt;/p&gt;
&lt;p&gt;This example shows how we can visualize embeddings computed from Star Wars in the vector space (courtesy &lt;a href=&quot;https://medium.com/@marcusa314/visualizing-words-377624cb20c7&quot;&gt;@Marcus Alder&lt;/a&gt;). Words with similar values across the same dimensions will be located closer to each other in the vector space. A word like &amp;ldquo;cola&amp;rdquo; will be at a shorter distance to &amp;ldquo;cold beverage&amp;rdquo; than to &amp;ldquo;hot chocolate milk&amp;rdquo;. Or in this visualization, &quot;Anakin&quot; will be closer to &quot;Luke&quot; than to &quot;Endor&quot;.&lt;/p&gt;
&lt;p&gt;&lt;img src=&quot;https://miro.medium.com/v2/resize:fit:720/1*Y7K_-6HZqbii1V8Ms6SALQ.gif&quot; /&gt;&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Why we need semantic search&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;The idea of semantic search is not new, but it has become a new paradigm in industry. It is also called (though these are not necessarily synonyms) neural search, vector search, dense neural retrieval, etc. The idea is that the search engine understands the intent of searchers by understanding the meaning of the words that it gets as input. It is made possible now with LLMs, which give us the embeddings. And crucially, we have vector search support in search engines to do vector similarity search using measures like Euclidean distance, dot product or cosine similarity. Vector search here is k-nearest-neighbor search on the embeddings. With approximate nearest neighbor search using &lt;a href=&quot;https://arxiv.org/abs/1603.09320&quot;&gt;Hierarchical Navigable Small World&lt;/a&gt; (HNSW) graphs, we can do this efficiently and at scale with good performance.&lt;/p&gt;
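At its simplest, vector search is brute-force k-nearest-neighbor search with a similarity measure such as cosine; HNSW exists precisely to approximate this efficiently at scale. A toy sketch with made-up 2-D vectors (real embeddings have hundreds of dimensions, and the values here are purely illustrative):

```python
import math

def cosine(a, b):
    # Cosine similarity: dot product of the vectors divided by their norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query_vec, embeddings, k=2):
    # Brute-force kNN: score every stored vector, keep the k most similar.
    scored = sorted(embeddings, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [name for name, _ in scored[:k]]

# Hypothetical 2-D embeddings for three phrases.
embeddings = [
    ("cola",          [0.9, 0.1]),
    ("cold beverage", [0.8, 0.2]),
    ("hot chocolate", [0.1, 0.9]),
]
print(knn([0.85, 0.15], embeddings, k=2))  # nearest phrases to the query vector
```

The brute-force loop is O(n) per query over the whole collection; HNSW trades a little accuracy for sub-linear lookups by navigating a layered proximity graph instead.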
&lt;p&gt;The limitation of standard full-text search with keyword matching is well known as the &lt;em&gt;vocabulary mismatch problem&lt;/em&gt;. Relevant content may not be found by site visitors when they use the all-important search box (site visitors who search are 3x more likely to convert), because the words that they use in their queries may not occur in the content that you have created. In the worst case, the dreadful &amp;ldquo;no results found&amp;rdquo; is shown, resulting in a missed chance to convert and perhaps even churn. Synonyms can be used to solve this problem. However, creating synonym lists is time-consuming and requires considerable labor, and you will likely always be playing catch-up. Moreover, with semantic search you can formulate your queries in natural language, enabling better ways to deal with queries.&lt;/p&gt;
&lt;p&gt;Solving the vocabulary mismatch problem is one of the quick wins. Improving relevance ranking is another. We can boost relevant content to the top by combining scores from semantic search with standard BM25 relevance ranking. It is crucial that we not only return results, but also return them in the most relevant order. The result that you see first has a greater chance of being clicked. Experiments (see for example &lt;a href=&quot;https://blog.vespa.ai/improving-zero-shot-ranking-with-vespa-part-two/&quot;&gt;here&lt;/a&gt;, &lt;a href=&quot;https://opensearch.org/blog/semantic-science-benchmarks/&quot;&gt;here&lt;/a&gt; and &lt;a href=&quot;https://towardsdatascience.com/text-search-vs-vector-search-better-together-3bd48eb6132a&quot;&gt;here&lt;/a&gt;) have shown that combining keyword-based (lexical) ranking with BM25 and vector-based search, an approach called hybrid search, works better than doing keyword search or vector search alone. And pure vector search does not always beat BM25, especially in specialized verticals.&lt;/p&gt;
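One common way to fuse lexical and vector result lists in hybrid search is reciprocal rank fusion (RRF), which merges ranked lists without having to normalize their incomparable raw scores. This is a general fusion technique, not necessarily how Graph combines scores internally:

```python
def rrf(rankings, k=60):
    # rankings: ranked doc-id lists, e.g. one from BM25 and one from kNN.
    # Each doc earns 1 / (k + rank) per list; k dampens top-rank dominance.
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_top = ["doc2", "doc1", "doc3"]     # lexical ranking
vector_top = ["doc1", "doc4", "doc2"]   # semantic ranking
print(rrf([bm25_top, vector_top]))
```

A document ranked well in both lists rises above one that appears in only a single list, which is the intuition behind hybrid search outperforming either method alone.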
&lt;p&gt;Another common application of semantic search is reducing the very convincing-looking but false information returned by chatbots driven by LLMs, a phenomenon known as hallucination. A very popular approach now is &lt;a href=&quot;https://research.ibm.com/blog/retrieval-augmented-generation-RAG&quot;&gt;Retrieval Augmented Generation&lt;/a&gt; (RAG). RAG improves the quality of LLM answers and offers transparency, because answers can be checked against their sources. Models have access to the most current, reliable facts, and users have access to the model&amp;rsquo;s sources, ensuring that its claims can be checked for accuracy and ultimately trusted. Imagine adding a chatbot powered by your content on your site using Optimizely Graph in the not-too-distant future. Wouldn&#39;t that be awesome?&lt;/p&gt;
&lt;h2&gt;&lt;strong&gt;Semantic search in Optimizely Graph&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;Our implementation of semantic search with &lt;em&gt;Graph&lt;/em&gt; has been released. We offer it as-a-service and have done the heavy lifting for you. The feature is experimental for now, which means that the results you get may change in the future, likely for the better as we improve on it. You can find the documentation on &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/semantic-search&quot;&gt;Optimizely Graph&#39;s semantic search&lt;/a&gt; here. We use a pre-trained model that has been trained on over 1 billion sentences from a plethora of well-known websites such as Wikipedia and Reddit. The model has been trained on English, and works on English content.&lt;/p&gt;
&lt;p&gt;You can start by synchronizing your content, but be sure to mark the fields that you think will be useful for full-text search by setting the property as &amp;ldquo;searchable&amp;rdquo;. We support semantic search for full-text search with the field &lt;code&gt;_fulltext&lt;/code&gt; and other searchable fields using the &lt;code&gt;contains&lt;/code&gt; and &lt;code&gt;match&lt;/code&gt; &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/graphql-schema&quot;&gt;operators in our query language&lt;/a&gt;. You can enable semantic search by adding the &lt;code&gt;SEMANTIC&lt;/code&gt; ranking enum in the orderBy argument. That&amp;rsquo;s it!&lt;/p&gt;
&lt;pre class=&quot;language-markup&quot;&gt;&lt;code&gt;{
  Content(orderBy: { _ranking: SEMANTIC }, where: { _fulltext: { contains: &quot;action movie&quot; } }) {
    total
    items {
      Name
      _fulltext
    }
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And it should return results that are about an action movie, even when &amp;ldquo;action movie&amp;rdquo; is not mentioned in your content. For example:&lt;/p&gt;
&lt;pre class=&quot;language-markup&quot;&gt;&lt;code&gt;{
  &quot;Content&quot;: {
    &quot;total&quot;: 4,
      &quot;items&quot;: [
        {
          &quot;Name&quot;: &quot;Standard Page 12&quot;,
          &quot;MainBody&quot;: &quot;Wild Wild West is a 1999 American steampunk Western film co-produced and directed by Barry Sonnenfeld and written by S. S. Wilson and Brent Maddock alongside Jeffrey Price and Peter S. Seaman, from a story penned by brothers Jim and John Thomas. Loosely adapted from The Wild Wild West, a 1960s television series created by Michael Garrison, it is the only production since the television film More Wild Wild West (1980) to feature the characters from the original series. The film stars Will Smith (who previously collaborated with Sonnenfeld on Men in Black two years earlier in 1997) and Kevin Kline as two U.S. Secret Service agents who work together to protect U.S. President Ulysses S. Grant (Kline, in a dual role) and the United States from all manner of dangerous threats during the American Old West.&quot;,
          &quot;_fulltext&quot;: [
            &quot;Wild Wild West is a 1999 American steampunk Western film co-produced and directed by Barry Sonnenfeld and written by S. S. Wilson and Brent Maddock alongside Jeffrey Price and Peter S. Seaman, from a story penned by brothers Jim and John Thomas. Loosely adapted from The Wild Wild West, a 1960s television series created by Michael Garrison, it is the only production since the television film More Wild Wild West (1980) to feature the characters from the original series. The film stars Will Smith (who previously collaborated with Sonnenfeld on Men in Black two years earlier in 1997) and Kevin Kline as two U.S. Secret Service agents who work together to protect U.S. President Ulysses S. Grant (Kline, in a dual role) and the United States from all manner of dangerous threats during the American Old West.&quot;,
            &quot;Standard Page 12&quot;
          ]
        },
        {
          &quot;Name&quot;: &quot;Temporary Page Title&quot;,
          &quot;MainBody&quot;: &quot;The American frontier, also known as the Old West, popularly known as the Wild West, encompasses the geography, history, folklore, and culture associated with the forward wave of American expansion in mainland North America that began with European colonial settlements in the early 17th century and ended with the admission of the last few contiguous western territories as states in 1912. This era of massive migration and settlement was particularly encouraged by President Thomas Jefferson following the Louisiana Purchase, giving rise to the expansionist attitude known as \&quot;Manifest Destiny\&quot; and the historians&#39; \&quot;Frontier Thesis\&quot;. The legends, historical events and folklore of the American frontier have embedded themselves into United States culture so much so that the Old West, and the Western genre of media specifically, has become one of the defining periods of American national identity.&quot;,
          &quot;_fulltext&quot;: [
            &quot;The American frontier, also known as the Old West, popularly known as the Wild West, encompasses the geography, history, folklore, and culture associated with the forward wave of American expansion in mainland North America that began with European colonial settlements in the early 17th century and ended with the admission of the last few contiguous western territories as states in 1912. This era of massive migration and settlement was particularly encouraged by President Thomas Jefferson following the Louisiana Purchase, giving rise to the expansionist attitude known as \&quot;Manifest Destiny\&quot; and the historians&#39; \&quot;Frontier Thesis\&quot;. The legends, historical events and folklore of the American frontier have embedded themselves into United States culture so much so that the Old West, and the Western genre of media specifically, has become one of the defining periods of American national identity.&quot;,
            &quot;Temporary Page Title&quot;
          ]
        },
        {
          &quot;Name&quot;: &quot;Wilder Westen&quot;,
          &quot;MainBody&quot;: &quot;Wilder Westen ist eine &amp;ndash; geographisch und historisch grob eingegrenzte &amp;ndash; umgangssprachliche Bezeichnung f&amp;uuml;r die ungef&amp;auml;hr westlich des Mississippi gelegenen Gebiete der heutigen Vereinigten Staaten. In der auch als &amp;bdquo;Pionierzeit&amp;ldquo; bezeichneten &amp;Auml;ra des 19. Jahrhunderts waren sie noch nicht als Bundesstaaten in die Union der Vereinigten Staaten aufgenommen. Im Verlauf der voranschreitenden Landnahme und Urbanisierung nahm die Besiedlung dieser Regionen vor allem durch Angloamerikaner &amp;ndash; bzw. aus Europa stammende Immigranten &amp;ndash; kontinuierlich zu, bis die Gebiete um 1890 in den organisierten Territorien der Vereinigten Staaten aufgingen. Symbolisch stehen die &amp;Ouml;ffnung der letzten Indianerterritorien im sp&amp;auml;teren US-Bundesstaat Oklahoma f&amp;uuml;r die Besiedlung durch Kolonisten 1889&amp;ndash;1895 durch eine Serie von Land Runs und das Massaker der United States Army an etwa 200 bis 300 Lakota am Wounded Knee Creek/South Dakota im Dezember 1890 f&amp;uuml;r das Ende der Zeit des Wilden Westens. Mit diesen Ereignissen galten die Indianerkriege ebenso als abgeschlossen wie die Kolonisation der bis dahin von den Vereinigten Staaten beanspruchten Hoheitsgebiete (engl. territories) durch die aus Europa eingewanderten Siedler.&quot;,
          &quot;_fulltext&quot;: [
            &quot;Wilder Westen ist eine &amp;ndash; geographisch und historisch grob eingegrenzte &amp;ndash; umgangssprachliche Bezeichnung f&amp;uuml;r die ungef&amp;auml;hr westlich des Mississippi gelegenen Gebiete der heutigen Vereinigten Staaten. In der auch als &amp;bdquo;Pionierzeit&amp;ldquo; bezeichneten &amp;Auml;ra des 19. Jahrhunderts waren sie noch nicht als Bundesstaaten in die Union der Vereinigten Staaten aufgenommen. Im Verlauf der voranschreitenden Landnahme und Urbanisierung nahm die Besiedlung dieser Regionen vor allem durch Angloamerikaner &amp;ndash; bzw. aus Europa stammende Immigranten &amp;ndash; kontinuierlich zu, bis die Gebiete um 1890 in den organisierten Territorien der Vereinigten Staaten aufgingen. Symbolisch stehen die &amp;Ouml;ffnung der letzten Indianerterritorien im sp&amp;auml;teren US-Bundesstaat Oklahoma f&amp;uuml;r die Besiedlung durch Kolonisten 1889&amp;ndash;1895 durch eine Serie von Land Runs und das Massaker der United States Army an etwa 200 bis 300 Lakota am Wounded Knee Creek/South Dakota im Dezember 1890 f&amp;uuml;r das Ende der Zeit des Wilden Westens. Mit diesen Ereignissen galten die Indianerkriege ebenso als abgeschlossen wie die Kolonisation der bis dahin von den Vereinigten Staaten beanspruchten Hoheitsgebiete (engl. territories) durch die aus Europa eingewanderten Siedler.&quot;,
            &quot;Wilder Westen&quot;
          ]
        },
        {
          &quot;Name&quot;: &quot;Arnold Schwarzenegger&quot;,
          &quot;MainBody&quot;: null,
          &quot;_fulltext&quot;: [
            &quot;Arnold Schwarzenegger&quot;
          ]
        }
      ]
  }
}&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Curious about how well it works on your content? Give Optimizely Graph a try.&lt;/p&gt;
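To try a query like the one above from your own code, you can send it as a standard GraphQL-over-HTTP request. A minimal Python sketch; the endpoint URL and auth key below are placeholders, so check the Optimizely Graph documentation for the exact endpoint and authentication scheme for your account:

```python
import json
import urllib.request

# Placeholders: substitute your own Graph endpoint and key.
ENDPOINT = "https://example.com/content/v2?auth=YOUR_SINGLE_KEY"

QUERY = """
{
  Content(orderBy: { _ranking: SEMANTIC }, where: { _fulltext: { contains: "action movie" } }) {
    total
    items { Name _fulltext }
  }
}
"""

def build_request(endpoint, query):
    # Standard GraphQL-over-HTTP: POST a JSON body with a "query" field.
    payload = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint, data=payload, headers={"Content-Type": "application/json"}
    )

req = build_request(ENDPOINT, QUERY)
# resp = urllib.request.urlopen(req)  # uncomment against a real endpoint
# print(json.load(resp)["data"]["Content"]["total"])
```

The response is the usual GraphQL envelope, with the results under a top-level `data` field.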
&lt;h2&gt;&lt;strong&gt;Wrap up&lt;/strong&gt;&lt;/h2&gt;
&lt;p&gt;I have presented the technical background of semantic search, explained some key concepts and terminology, covered the different applications of semantic search, and shown how you can use it in &lt;em&gt;Graph&lt;/em&gt;. With the introduction of semantic search, we have taken a big step in the mission to support you with AI search, as previously posited in my blog post &lt;a href=&quot;/link/24e13898fe1b4ca0a1aa0e0a33693b81.aspx&quot;&gt;&quot;Why Optimizely Graph is Search as a Service&quot;&lt;/a&gt;. So what are the next steps with AI-driven search? There are a few things. Currently, the model that we use supports English. We will soon release support for other languages, including the frequently requested Nordic languages. Another is improving the ranking together with standard relevance scoring. We are also looking into supporting custom models, DIY solutions using for example LangChain, and using &lt;em&gt;Graph&lt;/em&gt; to ground the truth in chatbots.&lt;/p&gt;
&lt;p&gt;What are the next steps besides improving on AI? &lt;em&gt;Graph&lt;/em&gt; gets powerful as it becomes the gateway to your content. Having good search will prove crucial here to achieve continuous success. Creating great search experiences is an art. The importance of search analytics and &lt;a href=&quot;https://feedback.optimizely.com/ideas/CG-I-65&quot;&gt;tracking&lt;/a&gt;, with both online and offline experimentation, will become clear with new features that show how science can meet art, and how we can measure performance and experiment. This will also allow us to support personalization with &lt;em&gt;Graph&lt;/em&gt;. In a future blog post, I will bring you an update on that. Have other ideas? Feel &lt;a href=&quot;https://feedback.optimizely.com/?project=OG&quot;&gt;free to share or upvote existing ideas&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Optimizely Graph allows you to aggregate on your content and other data with GraphQL, including using advanced search functionality driven by AI, so you can deliver it to any channel, app or device. It is part of the Optimizely PaaS offering as well as &lt;a href=&quot;https://www.optimizely.com/company/press/optimizely-one/&quot;&gt;Optimizely One&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</id><updated>2023-10-13T11:18:04.0000000Z</updated><summary type="html">Blog post</summary></entry> <entry><title>Why Optimizely Graph is Search as a Service</title><link href="https://world.optimizely.com/blogs/juntes-blog/dates/2022/12/why-content-graph-is-search-as-a-service/" /><id>&lt;p&gt;&lt;em&gt;John H&amp;aring;kansson has already announced the public beta of a &lt;a href=&quot;/link/386fb3fd598c4978aaf6906a1bbc1505.aspx&quot;&gt;new Optimizely service called Content Graph&lt;/a&gt; (now: Optimizely Graph). Jonas Bergqvist has created a &lt;a href=&quot;/link/2aefcdfa9169413db278877740df254a.aspx&quot;&gt;tutorial on how to get started with React and TypeScript&lt;/a&gt;. I blogged previously about &lt;a href=&quot;/link/b280f38c7e9d4a5d83c1eeb6f34326cb.aspx&quot;&gt;our developer journey so far&lt;/a&gt; in creating this service. In this blog post I want to follow up and explain why Optimizely Graph is not only for content delivery, but can be the search engine for your site.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;As recently as the 1990s, studies showed that most people preferred getting information from other people rather than from search engines. Back then, most people also used human travel agents to book their travel or asked a librarian to find a book. However, times have changed. During the last decades, optimization of information retrieval effectiveness has driven web search engines to new quality levels where most people are satisfied most of the time, and web search has become the preferred way of finding information.&lt;/p&gt;
&lt;p&gt;Any website that continuously publishes new content needs a search engine. Optimizely Graph provides exactly that. It is an Optimizely SaaS solution for creating a website with search functionality using a GraphQL API hosted on the CDN: a very nice, platform-independent content delivery API using GraphQL. However, I would argue that Optimizely Graph primarily allows you to build an advanced search engine, not merely a website, as it can do much more.&lt;/p&gt;
&lt;h2&gt;Why Building a Search Engine Is Hard&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;Consider a future device &amp;hellip;&amp;nbsp;&amp;nbsp;in which an individual stores all his books, records, and communications, and which is mechanized so that it may be consulted with exceeding speed and flexibility. It is an enlarged intimate supplement to his memory.&amp;rdquo;&lt;/em&gt; &amp;ndash; Vannevar Bush, 1945&lt;/p&gt;
&lt;p&gt;The vision of having a single system which you can access as an archive with speed and flexibility has been realized with the advent of search engines. A search engine is a complex system. Building and maintaining one is very hard and costly: from optimally storing the data so that sophisticated matching and ranking algorithms can run on it, to retrieving it as fast as possible, to processing queries that capture simple or very complicated information needs.&lt;/p&gt;
&lt;p&gt;We have off-the-shelf search engines that allow you to build your own search engine with your data. Well-known ones are Apache Solr, Elasticsearch and OpenSearch, which are driven by the information retrieval library Lucene. Other lesser-known, but no less powerful, examples are Sphinx and Vespa. What they all have in common is that you need to spend considerable time and effort in creating an application with them: infrastructure configuration and hosting, defining and tweaking index schemas, preparing data for ingestion, and writing optimal queries. In short, a lot of preparation and testing is needed before you can create and deploy your search engine.&lt;/p&gt;
&lt;p&gt;Elasticsearch is offered as-a-service in a single-tenant environment by Elastic, and OpenSearch as well by AWS and others. Basically, this means that it is offered as platform-as-a-service. You no longer need to manage the hosting and operations yourself, but you are still responsible for the higher software layers and for using the platform in the best possible way in your application. Getting it implemented is one thing; getting it right so you get good performance and conversion rates is difficult, and another cup of tea entirely. Something that requires continuous experimentation and improvement.&lt;/p&gt;
&lt;p&gt;How can we take a search engine from platform-as-a-service to truly software-as-a-service, where we do the heavy lifting for you? And how can we make sure you will get it right for your business context?&lt;/p&gt;
&lt;h2&gt;A Search Engine as a Service?&lt;/h2&gt;
&lt;p&gt;Imagine that we can move up in the value chain, where we do all the ingestion of your content, content enrichment, configuration, tweaking and tuning of the most efficient and effective queries, and the most useful ranking models for you. All that you must do to get started is add a few lines of configuration and hit the button. That would be truly software as a service, and in the case of Optimizely Graph, search (engine) as a service.&lt;/p&gt;
&lt;p&gt;This is exactly what we have created.&amp;nbsp;&lt;/p&gt;
&lt;p&gt;Optimizely Graph is a multi-tenant cloud service that offers you search as a service with GraphQL. We believe that GraphQL allows you to create intuitive and simple queries without a steep learning curve, because it is grounded in a strongly typed schema that offers introspection support. The query language that we developed with GraphQL allows you to query for your content created in Content Cloud, so you could create a whole site with a single GraphQL query. And it allows you to create site search by using the same API with predefined query templates to filter, match, retrieve and rank the results given keywords entered by site visitors. All you need is one endpoint and authentication keys provisioned by the DXP portal. Sounds easy, right?&lt;/p&gt;
&lt;p&gt;Besides search features, we have worked hard on our infrastructure code and configurations. You do not need to worry about operations. We will manage that for you. That does not mean we will be constantly busy manually managing the clusters. The beating (search engine) heart of Optimizely Graph is &lt;em&gt;now&lt;/em&gt; OpenSearch. We have spent quite some time on automating our multi-tenant distributed platform, so we can do rolling upgrades to newer (major) versions of OpenSearch containing bug fixes, performance improvements and new features, and, when necessary, upgrade all indices afterwards without interrupting or degrading the service. We also have auto-scaling of our distributed search engine clusters, so we will be ready for Black Friday-type scenarios. At the same time, we realize that in a multi-tenant platform, performance isolation could be needed. Have a very noisy neighbor on the same cluster? Or is there a need for single-tenant support? We are ready.&lt;/p&gt;
&lt;h2&gt;What Optimizely Graph can do as a Search Engine&lt;/h2&gt;
&lt;h3&gt;Ranking &amp;amp; Matching&lt;/h3&gt;
&lt;p&gt;Optimizely Graph offers a query language that allows you to precisely filter, select and navigate the information you need. What you request is exactly what you get, preferably at the top. We have value-based ordering by fields and state-of-the-art BM25 relevance ranking. The information will be returned optimally and at blazing speed. You can rank your results in very different ways, but also accurately and efficiently. The results are ranked based on filtering on values in fields with wildcard support (where we optimized suffix searches) or full-text search with language analysis on the fields that you want to support. We support text analysis for all languages that &lt;em&gt;Search &amp;amp; Navigation&lt;/em&gt; supports, and have improved full-text search in German and the CJK languages. This allows you to increase visitor engagement on your site by adding the all-important search box. Note that site visitors are 3x more likely to convert when they use the search box.&lt;/p&gt;
&lt;h3&gt;Query with relations&lt;/h3&gt;
&lt;p&gt;One of the things that we have created is a way to easily query relations in your content. Not only within a single index with different content types, but also among indices with potentially different data sources. There is a reason why this service is called Optimizely Graph after all. One of the first relation types you can query on is &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/parentchild-queries&quot;&gt;parent-child relations&lt;/a&gt;. This means you can use one query to get information on both a parent document and its linked child documents. The powerful thing here is that you can do filtering and full-text search and add facets as you query with this relation type, all with a single query. Content can be queried as a graph.&lt;/p&gt;
&lt;h3&gt;Navigation&lt;/h3&gt;
&lt;p&gt;Besides search, we offer different ways to do navigation. Obviously you can navigate through your search space with &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/skip-limit&quot;&gt;pagination&lt;/a&gt; and &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/cursor&quot;&gt;cursors&lt;/a&gt;, but more interestingly, you can use &lt;em&gt;facets&lt;/em&gt;. You can present very different views of your content by just tweaking the ranking and/or &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/facets&quot;&gt;adding facets&lt;/a&gt; and filters. With facets, we also support multi-select, which is conventional in the e-commerce domain. This is a great navigation technique that allows you to simultaneously zoom in and zoom out on your data. It is a great way to distribute diverse (personalized) content to visitors across different channels, but also between systems such as other search engines using web crawlers.&lt;/p&gt;
&lt;h3&gt;Search customization&lt;/h3&gt;
&lt;p&gt;The challenge for us is to create a solution that is generic but also flexible enough to realize different use-cases that could be possible.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;As such, there is no one-size-fits-all approach that anyone can offer you. The hot water that softens a carrot will harden an egg.&amp;rdquo;&lt;/em&gt; &amp;ndash; Clayton M. Christensen&lt;/p&gt;
&lt;p&gt;We believe that the current set of search features allows you to build a very good search engine. We will offer more out-of-the-box search features that will be state-of-the-art and will make the search engine &amp;ldquo;more intelligent&amp;rdquo;, and will allow you to opt in to AI search, so you will have a head start. Advances in conversational AI, such as with ChatGPT, show that the domain of search is continuously improving and changing. I have written &lt;a href=&quot;https://github.com/juntezhang/exploring-question-answering&quot;&gt;a short blog post&lt;/a&gt; on this topic before.&lt;/p&gt;
&lt;p&gt;But we also realize that we need to give you the control required to configure the best search for your context and business cases, for example by &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/synonyms&quot;&gt;configuring synonyms&lt;/a&gt; and &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/boosting&quot;&gt;field boosting&lt;/a&gt;. We support both today, and we intend to create more search features that give you control over the nuts and bolts so you can customize; I will explain more in a future blog post. We will keep extending our search capabilities to make a big difference for our customers and to go beyond what we offer now. Have ideas or a wishlist? You are &lt;a href=&quot;https://feedback.optimizely.com/?project=PGQL&quot;&gt;very welcome to share&lt;/a&gt; them with us.&lt;/p&gt;
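&lt;p&gt;As an illustration of query-time boosting, the sketch below weights matches on a title field above matches on a body field. The field names and the boost value are hypothetical, and the boost/_ranking syntax is my reading of the boosting docs rather than a guaranteed contract; consult the linked documentation for the authoritative shape.&lt;/p&gt;

```typescript
// Sketch: boost title matches above body matches at query time.
// Assumptions: "ArticlePage", "Title", "Body" and the boost value 10
// are illustrative placeholders for your own schema.
function buildBoostedQuery(term: string): string {
  return `
{
  ArticlePage(
    where: {
      _or: [
        { Title: { match: "${term}", boost: 10 } }
        { Body: { match: "${term}" } }
      ]
    }
    orderBy: { _ranking: RELEVANCE }
  ) {
    items { Title }
  }
}`;
}

console.log(buildBoostedQuery("pizza"));
```

&lt;p&gt;Combined with synonym configuration on the account, the same query text can then also match variant spellings and terms without any changes on the client side.&lt;/p&gt;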
&lt;p&gt;Want to see some GraphQL queries before getting started? The developer documentation for Optimizely Graph, which describes its query language with plenty of example queries, &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs&quot;&gt;can be found on our ReadMe&lt;/a&gt;.&lt;/p&gt;
&lt;h2&gt;Beta Availability&lt;/h2&gt;
&lt;p&gt;Optimizely Graph is available for all Optimizely DXP customers. Feel free to &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/getting-started&quot;&gt;get started&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Help us catch bugs! Users of the beta release are encouraged to report any issues to &lt;a href=&quot;https://www.optimizely.com/support/&quot;&gt;our support team&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Feature or change requests are warmly welcomed as well. You can submit ideas and feedback &lt;a href=&quot;https://feedback.optimizely.com/?project=PGQL&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</id><updated>2022-12-29T14:56:17.0000000Z</updated><summary type="html">Blog post</summary></entry> <entry><title>Content Graph, the Story So Far From a Developer&#39;s Perspective</title><link href="https://world.optimizely.com/blogs/juntes-blog/dates/2022/9/content-graph-a-new-optimizely-dxp-graphql-service-in-pubic-beta/" /><id>&lt;h2&gt;Introduction&lt;/h2&gt;
&lt;p&gt;As content is created and managed, the volume of data grows. The need to fish for information in the ocean of data becomes crucial.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;A world where everyone creates content gets confusing pretty quickly without a good search engine.&amp;rdquo;&lt;/em&gt; --- Ethan Zuckerman&lt;/p&gt;
&lt;p&gt;Here search engines -- and our new service Content Graph -- come into play! In this blog post, I will explain how we started our journey of creating this brand-new service in-house at Optimizely from a developer&#39;s point of view.&lt;/p&gt;
&lt;p&gt;We already have a very powerful, fluent C# client API called &lt;em&gt;Search &amp;amp; Navigation&lt;/em&gt; that allows you to search in your content. It is tightly coupled to the CMS and site building. As we offer a DXP with a headless CMS in Content Cloud, there is an opportunity to de-couple and offer a new service with capabilities similar to Search &amp;amp; Navigation, but one that is not tied to a programming language or framework and is easy to use for developers and even non-developers. John H&amp;aring;kansson has already &lt;a href=&quot;/link/386fb3fd598c4978aaf6906a1bbc1505.aspx&quot;&gt;announced the public beta&lt;/a&gt; of this service called &lt;em&gt;Content Graph&lt;/em&gt;. Jonas Bergqvist has created a &lt;a href=&quot;/link/2aefcdfa9169413db278877740df254a.aspx&quot;&gt;tutorial on how to get started&lt;/a&gt; with React and TypeScript. In this blog post, I reflect on what we have done so far and try to illustrate the vision and hopes for this new service from an engineering perspective.&lt;/p&gt;
&lt;h2&gt;Opportunity to Optimize&lt;/h2&gt;
&lt;p&gt;We identified an opportunity to help our customers. And for us as engineers, a new service brings new technical opportunities to do things not only differently, but better. There are lessons to be learned from Search &amp;amp; Navigation, in terms of operations but also of common and not-so-common use-cases. We wanted to learn from our vast experience and improve as much as we can with this new service, and to have a solid foundation to build on so we can deliver increasingly more value, with quality, in the future.&lt;/p&gt;
&lt;h3&gt;New architecture, newer technologies&lt;/h3&gt;
&lt;p&gt;As Content Graph is a new service, it has been built entirely from scratch at Optimizely. The GraphQL runtime is hosted on a CDN, which allows for the lowest and most stable latencies possible across regions. This service interfaces with a set of interconnected, containerized micro-services running on Kubernetes. We are also using OpenSearch, the open-source fork of Elasticsearch, as a distributed search engine to store, index and retrieve data; it too is deployed as containers on Kubernetes. For all our micro-services and OpenSearch clusters we have configured horizontal auto-scaling, and in the case of OpenSearch we are using a Kubernetes operator. All of this is provisioned with automated pipelines with unit, end-to-end and performance tests to ensure continued quality.&lt;/p&gt;
&lt;p&gt;This setup allows our geographically distributed teams to simplify the development, release and deployment processes --- something we are constantly improving. It also lets us optimize IT costs through smarter resource allocation across the different micro-services using horizontal auto-scaling. It makes our service more stable, with high availability and self-healing if an instance of a service crashes. And it keeps us independent of any single cloud provider, just in case.&lt;/p&gt;
&lt;h3&gt;Operational readiness&lt;/h3&gt;
&lt;p&gt;As with any new service built from scratch and offered as-a-service, we need to be prepared and ready to understand what is going on, make sure everything is up and running, and offer support in case of issues. In the beta phase of Content Graph, the development team also provides operational support.&lt;/p&gt;
&lt;p&gt;We have spent considerable time on operational readiness, i.e., improving our logging and traceability so that we can strengthen our monitoring and alerting in case of system issues, but also gain a greater understanding of usage metrics. Dashboards with metrics have been created, and alerts via email and chat have been set up.&lt;/p&gt;
&lt;p&gt;Good developer documentation is pivotal as well. We made writing developer documentation part of our development work. A user story will only be accepted as done when the documentation is there or has been updated. The &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs&quot;&gt;developers&#39; documentation of Content Graph&lt;/a&gt; is kept up to date as we extend our product.&lt;/p&gt;
&lt;p&gt;This all gives us a feeling that we can be the captain on our ship, as more passengers board to join this exciting journey.&lt;/p&gt;
&lt;h3&gt;Performance and experimentation&lt;/h3&gt;
&lt;p&gt;Having the best possible performance is also one of our key objectives. With performance here we mean the efficiency of processing requests and returning responses with the lowest latency possible.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&amp;ldquo;There are no speed limits on the road to success.&amp;rdquo;&lt;/em&gt; --- David W. Johnson&lt;/p&gt;
&lt;p&gt;It is a given that there will always be a hop between the CDN and our Kubernetes cluster over the internet. We reduce this cost with caching in the CDN. The distribution of queries sent to our system will likely have a long tail, but the head of the distribution is expected to make up the bulk of the requests, so caching it ensures the fastest possible performance. Content can also be frequently updated, so we invalidate caches when there are updates. Caching will be made smarter in the future.&lt;/p&gt;
&lt;p&gt;Besides caching in the CDN, we have spent time improving performance within the cluster by tracing response times across the different services. Crucial here is benchmarking with performance tests. Given the same experimental conditions, we can hypothesize, tweak things --- like different configurations or different implementations --- that we think will be an improvement, and compare the results. And every hundred milliseconds gained counts! Besides software development, we also experiment in an Agile way --- incrementally and iteratively. And here too, documentation is key. In a future blog post, we will offer more insights into the setup and results of our experimentation.&lt;/p&gt;
&lt;h2&gt;Search as a Service, Reloaded&lt;/h2&gt;
&lt;p&gt;Content Graph can deliver content to you. It offers an intuitive query language that allows you to precisely filter, select and navigate the information you need based on strongly typed schemas. What you request is exactly what you get. Content delivery for building or populating sites is made simple and easy.&lt;/p&gt;
&lt;p&gt;But the core of Content Graph is driven by a search engine. Search engines are information systems designed to help find stored information with a query, with results typically delivered as an ordered list. The primary advantage of a search engine is flexible retrieval of results with high performance in both efficiency and effectiveness. So a key differentiator of Content Graph is its search capabilities. We offer precise and powerful text matching. You can increase visitor engagement on your site by offering site search, e.g., adding a search box and facets, as well as different ways to rank your results effectively, accurately and efficiently. We have value-based ordering by fields and state-of-the-art relevance ranking. The information will be returned in the order that best drives improved conversions. It is a great way to distribute content to visitors, but also between systems like web crawlers. We will continue offering new search capabilities that will make a big difference for our customers, and improve and go beyond what we offer now. And if you have ideas or a wishlist, you are &lt;a href=&quot;https://feedback.optimizely.com/?project=PGQL&quot;&gt;very welcome to share&lt;/a&gt; them with us.&lt;/p&gt;
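&lt;p&gt;To show how simple delivery can be, here is a sketch of posting a query over plain HTTP and ordering results by a field value. The endpoint URL and the single-key authentication style follow my reading of the getting-started docs; YOUR_SINGLE_KEY is a placeholder, and the content type and field names are examples only.&lt;/p&gt;

```typescript
// Sketch: querying Content Graph over plain HTTP.
// Assumptions: the endpoint and ?auth= single-key style are taken from
// the getting-started docs; YOUR_SINGLE_KEY and the field names
// ("ArticlePage", "Name", "StartPublish") are placeholders.
const endpoint = "https://cg.optimizely.com/content/v2?auth=YOUR_SINGLE_KEY";

function buildRequestBody(graphqlQuery: string): string {
  return JSON.stringify({ query: graphqlQuery });
}

// Example: value-based ordering, newest first by a date field.
const body = buildRequestBody(`
{
  ArticlePage(orderBy: { StartPublish: DESC }) {
    items { Name StartPublish }
  }
}`);

console.log("POST", endpoint);
// Uncomment to execute against a real account:
// fetch(endpoint, {
//   method: "POST",
//   headers: { "Content-Type": "application/json" },
//   body,
// }).then((r) => r.json()).then(console.log);
```

&lt;p&gt;Because the request is just JSON over HTTP, any language or framework can consume the service --- which is exactly the de-coupling from C# and site building described above.&lt;/p&gt;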
&lt;p&gt;I would argue that Content Graph is in essence primarily a search engine with graph capabilities --- but one that is offered as-a-service and comes out of the box without any complicated configuration, steep learning curve and know-how, or operational costs. Building a search engine on your own data to provide access to your content has become very easy. The time to market is reduced. And we allow you to focus on what&amp;rsquo;s most important to you: delivering optimal user experiences, which includes the search experience, with content creation on your platform. Stay tuned for a future blog post by me about this topic.&lt;/p&gt;
&lt;p&gt;So, buckle up, we are in for a fun ride! The journey continues.&lt;/p&gt;
&lt;h2&gt;Beta Availability&lt;/h2&gt;
&lt;p&gt;Content Graph is available for all Optimizely DXP customers. Feel free to &lt;a href=&quot;https://docs.developers.optimizely.com/digital-experience-platform/v1.4.0-content-graph/docs/getting-started&quot;&gt;get started&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Help us catch bugs! Users of the beta release are encouraged to report any issues to &lt;a href=&quot;https://www.optimizely.com/support/&quot;&gt;our support team&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Feature or change requests are warmly welcomed as well. You can submit ideas and feedback &lt;a href=&quot;https://feedback.optimizely.com/?project=PGQL&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The developers of Content Graph look forward to receiving your feedback on this beta release!&lt;/p&gt;</id><updated>2022-09-08T08:55:54.0000000Z</updated><summary type="html">Blog post</summary></entry></feed>