by Rob Macrae
Note: Despite the overall message encouraging people to utilize ChatGPT, this post itself was not written by AI.
I'm a solo founder but it doesn't feel that way. ChatGPT is providing me with advice on various topics I'm unfamiliar with and doing actual work that I would otherwise have to do myself or pay someone for. But before I break down all the ways I'm finding utility in ChatGPT, I should first mention that I'm grateful for all the help I've received from various folks and this is not intended to discredit that. Rather, I feel the need to give credit where it's due to this amazing tool that OpenAI have built as without it, I might have never gotten Summer AI to the market.
Obviously there are major caveats to working with a Large Language Model (LLM) and you should always check the output it generates. I suspect there may be something akin to a Gell-Mann amnesia effect going on with LLMs, where we believe in their accuracy even when tests against our own areas of expertise reveal errors, much like how we trust newspaper articles even after seeing similar errors. It's tempting to anthropomorphize AI and forget it's essentially a neat bag of parlor tricks on an extremely high dimensional corpus of language embeddings. For critical questions, you shouldn't take the output at face value.
For everything else it has become remarkably effective. Which brings us to ChatGPT's first superpower, which is its breadth of knowledge.
Unless you are a serial startup founder who knows the ins and outs of your business area, there will likely be a lot about building your startup that you are not initially familiar with. One of the main reasons people pick a cofounder is to find complementary skills, which amounts to filling in the gaps in your knowledge. Because of the sheer volume and breadth of its training data, ChatGPT's knowledge base goes a long way to negating the need to fill out your team's skillsets.
The chart above uses the Dewey Decimal Classification (DDC) to organize knowledge into a wide range of categories. I asked ChatGPT to rate the knowledge of the average human in each topic, and how each of the GPT models compares, against an expert human baseline of 10. You can see the full prompt used below.
For each of the following csv formatted topics within the Dewey Decimal System we are going to add,
some columns. For each of these columns we are going to use a scale from 0.0 to 10.0, with one decimal place of
precision. The first new column will be an expert humans knowledge of the subject and we will set this as 10.0
as our benchmark for the rest. Then add a column for the average humans level of knowledge of that topic.
Then add a score for each of the GPT models GPT1, GPT2, GPT3, GPT3.5, and GPT4's knowledge of that particular topic.
The scores should be in floating point and take into account the GPT model's strengths and weaknesses in those
various subjects due to factors such as prevalence of training data etc. The scores should take into account
GPT's training cut off date and that it is now September 2023. Additionally, if the subject relates to humor or
sarcasm such as 817 or 818 then the GPT models should be penalized for being really REALLY bad at jokes.
"005","Computer programming, programs, data, security",
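If you want to repeat this experiment over the whole DDC, pasting rows by hand gets tedious. Below is a minimal Python sketch of how the prompt could be assembled and the reply read back. The function names, the `COLUMNS` order and the idea that the model replies with one CSV row per topic are all assumptions made for illustration, not a record of how the chart was actually produced.

```python
import csv
import io

# Assumed column order for the reply: the two input columns, then the
# seven scores in the order the prompt asks for them.
COLUMNS = ["code", "topic", "expert", "average_human",
           "gpt1", "gpt2", "gpt3", "gpt3_5", "gpt4"]


def build_prompt(instructions: str, topics: list[tuple[str, str]]) -> str:
    """Append the topics to the instructions as fully quoted CSV rows."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for code, name in topics:
        writer.writerow([code, name])
    return instructions.strip() + "\n" + buf.getvalue()


def parse_scores(reply: str) -> list[dict]:
    """Keep only the reply lines that look like complete, in-range score rows."""
    rows = []
    for record in csv.reader(io.StringIO(reply)):
        if len(record) != len(COLUMNS):
            continue  # conversational filler or a malformed row
        try:
            scores = [float(value) for value in record[2:]]
        except ValueError:
            continue  # most likely a header row
        if not all(0.0 <= score <= 10.0 for score in scores):
            continue  # outside the 0.0 to 10.0 scale the prompt asks for
        rows.append(dict(zip(COLUMNS, record[:2] + scores)))
    return rows
```

Dropping rows that fail to parse, rather than trying to repair them, fits the earlier advice to always check the output: anything the parser rejects is worth reading by eye.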
It is not that surprising that it ranks itself above the average human, as any individual's knowledge of various subjects is quite sparse. I would consider my own knowledge to be superior to ChatGPT4's only in a handful of areas such as building AI tour guide apps like Summer AI or cooking banana pancakes. What is interesting here is the gap above the average human knowledge that even GPT 1 supposedly holds and how the gap to the expert human knowledge level has been quickly shrinking over the past 5 years.
While ChatGPT4 might not have the same level of knowledge as an expert, and its interpretation of your requirements might not always be spot on, the fact that it delivers its answers almost instantly means you can have multiple rounds to improve upon the answer in very little time. That speed is ChatGPT's second superpower.
Thanks to the pandemic, we are now used to most of our communications happening asynchronously via Slack or email. And whilst you can still talk directly to your human cofounder, if there is any work or research that needs to be done, then you will still need to wait. ChatGPT delivers responses, work, and research within a matter of seconds. At an average response time of 9 seconds, ChatGPT4 actually isn't the fastest of the LLMs, with both Google's Bard (3.5 seconds) and Anthropic's Claude (5 seconds) achieving faster response times on average.
Here are the main areas where ChatGPT is helping build Summer AI:
As a first-time founder, there are many topics to learn about running a business. There is incorporation, equity stock pools, trademarks, accounting, taxes, patents, convertible notes, SAFEs, marketing, contractors, just to name a few. There are great resources online for learning about these things and I highly recommend YC's Startup School. But often you may find yourself with nuanced questions that Google is not equipped to help find the answer to. This is one area where ChatGPT shines. If you are not worried about putting Intellectual Property into your prompts then you can even have it review business plans, suggest improvements, ideate on product and features for you.
I find the best way to utilize ChatGPT for writing code is to give it coding interview style questions. If you test it in a language you are familiar with, you will find ChatGPT4 produces code that is comparable to that of a mediocre software engineer. What makes it really useful, however, is that it delivers these responses almost instantly and there is no fatigue or other similar concerns you may need to worry about when working with a human engineer. You can lazily copy and paste any errors without any context, and it will politely find the cause. This lets you iterate at a lightning pace, which is critical to making progress and keeping focus.
One of the benefits of this working pattern is I've found myself focusing more on my abilities to manage and direct the coding than writing the code itself. And just as an engineering manager can manage software engineers working in a different language to those they are familiar with themselves, you can direct ChatGPT, and by extension, code in any language necessary.
How to best use ChatGPT for coding deserves its own article, and here is one with some tips: Harnessing the Power of ChatGPT-4 for Software Engineers.
GPT4's code output capabilities also cover front end work such as HTML, CSS, JS or Swift and you can prompt it, just as you might a human engineer, to "make a nice looking form for..." or to "space things out a bit more". I've even asked ChatGPT to help select color schemes for Summer AI and it was able to pick a favorite. Below is an excerpt from that conversation, but you can view the prompt and full response here.
Additionally, another AI model, Midjourney's V5, was responsible for creating the Summer AI logo with the prompt "letter s sound wave vector logo, simple, flat, minimal, modern, isolated on white, sun, --v 5".
Of course, the main use case for an LLM at Summer AI is the summarization of real world tour guide information. I actually don't rely on ChatGPT for summarizations as I've found a smaller, more focussed bespoke model can perform just as well, if not better, than ChatGPT. And no matter how much prompt engineering you try to reduce the effect, summarizing via LLMs is still prone to hallucinations or embellishments. In particular, I've noticed that text explicitly removed from Wikipedia might re-appear after summarizing, most likely due to Wikipedia's original data being part of the learned embeddings.
Unlike the previous use cases, the output of this task is intended to be consumed by Summer AI's users. Therefore, it's important to take steps to make sure it's safe and correct. We have a proprietary algorithm for fact checking these summarizations against the input, and the result then gets reviewed by both ChatGPT and a human moderator.
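To give a feel for what checking a summary against its input can look like, here is a deliberately crude Python sketch. It is not Summer AI's proprietary algorithm: it only flags summary sentences whose longer words mostly do not appear anywhere in the source text. The function name, the word-length filter and the `min_overlap` threshold are all assumptions chosen for illustration.

```python
import re


def unsupported_sentences(source: str, summary: str, min_overlap: float = 0.6) -> list[str]:
    """Return summary sentences that share too few words with the source.

    A naive word-overlap heuristic (an assumption for illustration only).
    It catches wholesale additions, not subtle factual changes.
    """
    source_words = set(re.findall(r"[a-z0-9']+", source.lower()))
    flagged = []
    # Split on whitespace that follows sentence-ending punctuation.
    for sentence in re.split(r"(?<=[.!?])\s+", summary.strip()):
        # Ignore short words, which are mostly function words.
        words = [w for w in re.findall(r"[a-z0-9']+", sentence.lower()) if len(w) > 3]
        if not words:
            continue
        overlap = sum(w in source_words for w in words) / len(words)
        if overlap < min_overlap:
            flagged.append(sentence)
    return flagged
```

A check like this is most useful for the failure described above, where a summary brings back material that was never in the input. A sentence that merely rearranges words from the source into a false claim would pass, which is one reason a human moderator still sits at the end of the pipeline.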
ChatGPT can also act as a moderator, reviewing the output of other LLMs. This can be to judge the quality of the output on a number of dimensions. It can also flag content or add other labels where appropriate. These edits can then filter through to a human moderator, cutting down on the amount of work needed and ensuring human eyes are more likely to review content that needs reviewing.
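The hand-off from the LLM moderator to the human one can be sketched as a simple priority queue. The example below is in Python and assumes the LLM's review has already been parsed into per-dimension scores and a list of flags. The `Review` class, its field names and the ordering rule are hypothetical and are not taken from Summer AI's pipeline.

```python
from dataclasses import dataclass, field


@dataclass
class Review:
    """One LLM moderation result (hypothetical structure)."""
    item_id: str
    scores: dict[str, float]                        # e.g. {"accuracy": 8.5}, higher is better
    flags: list[str] = field(default_factory=list)  # labels added by the LLM moderator


def prioritize(reviews: list[Review]) -> list[Review]:
    """Order items so the human moderator sees the most suspect ones first.

    Items with more flags come first. Ties are broken by the lowest
    score on any dimension, lowest first.
    """
    return sorted(
        reviews,
        key=lambda r: (-len(r.flags), min(r.scores.values(), default=10.0)),
    )
```

Sorting rather than filtering keeps every item available for review while making it more likely that limited human attention lands on the content that needs it.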
And of course, ChatGPT proofread this blog post.
ChatGPT reduces, and therefore potentially replaces, some of the need to bring people on board to fulfill roles that you are not familiar with. Just as larger organizations might find they can get more work done with less (the less being optional), smaller organizations and early stage startups will find it increasingly hard to justify each new hire.
I hope this has been of some help to those on the fence about utilizing LLMs when starting a business. The list of ways an AI can help is by no means meant to be exhaustive and I'm constantly discovering new use cases for it. Please get in touch if you have any feedback, suggestions, or your own stories about how AI has helped your startup.