Large Language Models for Development: Why Information Matters
A blog post for the Global Prosperity Institute on the downsides of Large Language Models (LLMs) for Least Developed Countries (LDCs) that have not yet been fully explored.
ChatGPT and other Large Language Models (LLMs) have been transformative in the few months that they have been around. For knowledge workers, this could mean a multiplication of productivity in the near term. In the long term, are we getting closer to the Singularity, or to an AGI that techno-optimists say would move us closer to a post-scarcity economy? This post explores how LLMs may negatively affect least developed countries (LDCs) in ways that have received little attention in the wider discussion of LLM uses.
A good way to start is to put the same question to an LLM. If we ask Bing Chat “what are the potential impacts of Large Language Models like GPT4 on least developed countries”, we get:
This post focuses on the potential long-term economic risks of LLMs, so we dig into the point on “digital divide and inequality”, which gives the following response:
Indeed. Of particular concern is the small amount of training data, of questionable accuracy, available from and for LDCs.
Lack of training data
To oversimplify, LLMs work by “guessing” the next word based on the vast amounts of training data that have been fed into their algorithms. If less training data is available about a particular subject, or the data is less accurate, the models will be less accurate on that subject. The table below gives Google search result counts for how many pages are available on a specific topic in the context of the US, the UK, and Tanzania respectively. This gives a relative idea of how much training data is available for topics that are specific to each country.
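To make the “guessing” idea concrete, here is a minimal sketch in Python. It is a toy word-counting model, not how a real LLM is built (those use neural networks trained on far more text), and the corpus, the place names “alphaland” and “betaland”, and the function names are all invented for illustration. It only shows that a topic with fewer, less consistent examples yields a less certain guess.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count which word follows each two-word context in the corpus."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for first, second, nxt in zip(words, words[1:], words[2:]):
            counts[(first, second)][nxt] += 1
    return counts


def guess_next(counts, first, second):
    """Return (most frequent next word, its share of observations, observations)."""
    followers = counts.get((first.lower(), second.lower()))
    if not followers:
        return None, 0.0, 0
    word, seen = followers.most_common(1)[0]
    total = sum(followers.values())
    return word, seen / total, total


# Invented toy corpus: "alphaland" is written about often and consistently,
# "betaland" rarely and inconsistently.
corpus = (
    ["alphaland exports machinery"] * 9
    + ["alphaland exports cars"]
    + ["betaland exports coffee", "betaland exports gold"]
)

counts = train(corpus)
well_covered = guess_next(counts, "alphaland", "exports")
sparse = guess_next(counts, "betaland", "exports")
print("well covered topic:", well_covered)
print("sparsely covered topic:", sparse)
```

In this toy setup the guess for the well-covered topic rests on ten consistent observations, while the guess for the sparse topic rests on two that disagree with each other, so it is little better than a coin flip.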
A second potential problem is the quality and nuance of the information published about LDCs on the Internet. Much of it is written by organisations that are not based in the country they are writing about, which can lead to a lack of context. Most of it is also written by research organisations and development agencies or NGOs, so private sector perspectives are often missing. As a result, the current corpus of information on these subjects is one-sided, and decisions based on it are affected accordingly.
With LLMs this could lead to an information cascade. The small amount of information currently available is incomplete; LLMs parse it to help writers produce additional articles, and those articles cement the falsehoods, because a larger and larger proportion of the data that trains these models is derived from a small subset of original, incorrect articles. Hopefully this does not happen, but in the worst case it will. A flow chart showing how this potential feedback loop would work is shown below:
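As a rough numerical companion to this feedback loop, the sketch below simulates a corpus that grows only through model-assisted articles. Everything in it is an assumption made for illustration: the function name, the starting numbers, and the simplifying rule that each new article inherits errors at the same rate as the corpus it was derived from. It is not a forecast, only a way to see the mechanism.

```python
def simulate_cascade(original_articles, flawed_share, new_per_round, rounds):
    """Track how a corpus evolves when all new articles are derived from it.

    Assumption: no independent reporting enters the corpus, and each batch of
    model-assisted articles repeats errors at the corpus's current error rate.
    """
    total = float(original_articles)
    flawed = original_articles * flawed_share
    history = []
    for _ in range(rounds):
        current_flawed_share = flawed / total
        flawed += new_per_round * current_flawed_share
        total += new_per_round
        history.append(
            {
                "derived_share": 1 - original_articles / total,
                "flawed_share": flawed / total,
            }
        )
    return history


# Hypothetical starting point: 100 original articles, 40% of them flawed,
# and 50 model-assisted articles added in each round.
history = simulate_cascade(
    original_articles=100, flawed_share=0.4, new_per_round=50, rounds=4
)
for round_number, state in enumerate(history, start=1):
    print(round_number, round(state["derived_share"], 2), round(state["flawed_share"], 2))
```

Under these assumptions the share of derived articles climbs every round while the share of flawed content never falls, which is the “cementing” effect: the errors are not corrected, they are simply repeated across a growing body of text that all traces back to the same small set of sources.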
Reduction of job opportunities for low level knowledge workers
As we have seen in many countries that move from agriculture to manufacturing and beyond, one typical pathway to more productive jobs is knowledge work in back-office roles serving a globalised world. From the armies of tech workers in Hyderabad to the call centres of Cebu, these are roles that could be automated away by more sophisticated uses of AI.
This means that many low-income countries will need different pathways for workers to move up the income ladder. Only the most productive roles will still be outsourced, leaving many potential workers without a role in the future global economy, as they will be unable to compete with a highly productive knowledge worker paired with an AI. A recent paper by OpenAI[1] showed the vast number of jobs that could be affected by LLMs, and this is only the tip of the iceberg: the impact will grow once people and corporations properly integrate these tools into their workflows.
Is this just a rehash of the old argument made famous by the Luddite movement during the Industrial Revolution? Possibly, but many textile workers of that time did lose their jobs and had to find new skilled work or accept lower pay, because mechanical looms could do their jobs more efficiently and with fewer skills. We think this should be seen as an opportunity: the UK and the world were better off for the mechanical loom, and the same will hold for AI. If LDCs treat this as an opportunity for workers to gain a productivity advantage, it could be a massive boost to economic growth in the medium term. Governments will need to take this into account when designing the industrial policies and growth frameworks that will hopefully allow LDCs to eventually become middle-income countries.
What can be done?
It is still just the start of the AI revolution. We think it is possible to keep these issues from becoming entrenched and for AI and LLMs to power higher productivity and growth in LDCs. How can we increase the chances of a positive outcome? The first step is recognising that this is a potential issue and letting the developers of these systems know about it. The models can then be trained to give more weight to balanced articles and posts, which should lead to better decision making in business, development, and policy. Just as Wikipedia is able to correct itself, so should LLMs; developers just need to build this understanding into their models.
On the issue of people being left behind, there should be an emphasis on training and education using LLMs. This is a great opportunity for young people in poor countries to tap the wealth of the world’s knowledge in an accessible way. However, we need to ensure that people have the means to reach these tools through computers and smartphones, and unsurprisingly the answer to that is increased growth.
[1] Eloundou et al. 2023 “GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models” https://arxiv.org/pdf/2303.10130.pdf