large language models

Web research

The goal: Conduct research and present reports on the companies specified by the user in a table format. The information provided by the tool includes: website address, company logo, funding, annual revenue, most stared github repository and a summary of their activities.

Technical procedure

1. Context retrieval:

The app performs multiple google searches over specialized websites in order to get the appropriate data.

2. Document summarization:

Summarize every chunk of the webpage content using gpt-3.5 turbo, then all the summaries are combined into the final summary.

3. Webpage rendering:

Each web page it’s rendered using selenium to avoid losing information on javascript rendered pages.

4. Data agents:

There are 7 agents specialized in obtaining each piece of information (e.g. summary, logo, annual revenue, etc.). Each of these is capable of doing Google searches, rendering web pages and deducing in one or more iterations what the correct answer is.

5. Generate response:

The user’s question and the retrieved context are sent as input to the GPT model to generate a response in natural language based on this input. The generated response is presented to the user through the chatbot interface.

User instructions

1. Chatbot interaction:

The user can provide company names to the chatbot interface and the app will display all the information it can get in a table format. The user can also specify what data field he needs.

2. Batch request:

The user can provide an email and a list of companies to process. After the process finished the results are sent to the provided email address

Questions

  1. Openai
  2. Openai, Flair, Facet ai
  3. Give me the annual revenue of Openai, Flair, Facet ai
  4. Give me the Logo and funding of Openai, Flair, Facet ai