{"id":17802,"date":"2023-03-16T01:01:04","date_gmt":"2023-03-15T14:01:04","guid":{"rendered":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/?p=17802"},"modified":"2023-03-16T12:39:21","modified_gmt":"2023-03-16T01:39:21","slug":"gpt-4-is-here-what-is-it-and-what-does-this-mean-for-higher-education","status":"publish","type":"post","link":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/gpt-4-is-here-what-is-it-and-what-does-this-mean-for-higher-education\/","title":{"rendered":"GPT-4 is here. What is it, and what does this mean for higher education?"},"content":{"rendered":"<p><span style=\"font-weight: 400;\">The internet has exploded again, this time with the release of <\/span><a href=\"https:\/\/openai.com\/research\/gpt-4\"><span style=\"font-weight: 400;\">GPT-4<\/span><\/a><span style=\"font-weight: 400;\"> on <\/span><a href=\"https:\/\/www.piday.org\/\"><span style=\"font-weight: 400;\">Pi day<\/span><\/a><span style=\"font-weight: 400;\">, 2023. But what is this next generation AI, and what impact might it have for higher education?<\/span><\/p>\n<h2>Just tell me how to get my hands on it!<\/h2>\n<p><span style=\"font-weight: 400;\">GPT-4 is a much-improved model compared to GPT-3.5, which is the large language model that powers the now infamous <\/span><a href=\"https:\/\/chat.openai.com\/\"><span style=\"font-weight: 400;\">ChatGPT<\/span><\/a><span style=\"font-weight: 400;\">. OpenAI, the company behind ChatGPT, GPT-3.5, and now GPT-4, calls GPT-4 a &#8216;large multimodal model&#8217;, because it is able to take image and text inputs, and respond through text. Currently, the image input is not available publicly \u2013 yet.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">To send text input to GPT-4 and see its improved responses, there are currently two ways. 
One is by having a <\/span><a href=\"https:\/\/openai.com\/blog\/chatgpt-plus\"><span style=\"font-weight: 400;\">ChatGPT Plus subscription<\/span><\/a><span style=\"font-weight: 400;\">, which costs USD20 per month. A free approach to using GPT-4 is through <\/span><a href=\"https:\/\/www.bing.com\/new\"><span style=\"font-weight: 400;\">Bing Chat<\/span><\/a><span style=\"font-weight: 400;\">, Microsoft&#8217;s AI-powered chatbot released on 7 February 2023 that was rumoured (and now <\/span><a href=\"https:\/\/twitter.com\/JordiRib1\/status\/1635694953463705600?t=ZVbIe2IbQAQuGXcwcl3dpg\"><span style=\"font-weight: 400;\">confirmed<\/span><\/a><span style=\"font-weight: 400;\">) to be powered by GPT-4. Bing Chat is still in limited public release, and there is a waiting list to get onto it (it took me about two weeks to get access). Bing Chat has the added advantage of having live access to the internet, so it is less prone to hallucination (the tendency of language models to make things up).<\/span><\/p>\n<h2>What&#8217;s so special about GPT-4 compared to previous versions?<\/h2>\n<p><span style=\"font-weight: 400;\">OpenAI has released a <\/span><a href=\"https:\/\/cdn.openai.com\/papers\/gpt-4.pdf\"><span style=\"font-weight: 400;\">technical report<\/span><\/a><span style=\"font-weight: 400;\"> alongside GPT-4&#8217;s release, which describes how GPT-4 is &#8220;less capable than humans in many real-world scenarios, [but] exhibits human-level performance on various professional and academic benchmarks&#8221;. Perhaps most strikingly, <strong>GPT-4 appears to be much better at reasoning and understanding complex scenarios<\/strong>. The report claims that GPT-4 performs in the top decile of human test-takers on a simulated bar exam, whilst GPT-3.5 (the current model behind the free version of ChatGPT) scores in the bottom decile. 
OpenAI&#8217;s GPT-4 <\/span><a href=\"https:\/\/www.youtube.com\/watch?v=outcGtbnMuQ&amp;ab_channel=OpenAI\"><span style=\"font-weight: 400;\">release webinar<\/span><\/a><span style=\"font-weight: 400;\"> showed GPT-4 calculating tax liabilities given a complex set of tax legislation and a scenario (skip to 19:03 in the recording), demonstrating its ability to reason through complex tasks.<\/span><\/p>\n<figure id=\"attachment_17805\" aria-describedby=\"caption-attachment-17805\" style=\"width: 270px\" class=\"wp-caption alignright\"><a href=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-content\/uploads\/2023\/03\/2023-03-16-00_44_29-gpt-4.pdf.png\"><img loading=\"lazy\" decoding=\"async\" class=\" wp-image-17805\" src=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-content\/uploads\/2023\/03\/2023-03-16-00_44_29-gpt-4.pdf.png\" alt=\"Physics question in French including an image, with answer by GPT-4.\" width=\"270\" height=\"355\" srcset=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-content\/uploads\/2023\/03\/2023-03-16-00_44_29-gpt-4.pdf.png 581w, https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-content\/uploads\/2023\/03\/2023-03-16-00_44_29-gpt-4.pdf-228x300.png 228w, https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-content\/uploads\/2023\/03\/2023-03-16-00_44_29-gpt-4.pdf-370x486.png 370w, https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-content\/uploads\/2023\/03\/2023-03-16-00_44_29-gpt-4.pdf-570x749.png 570w, https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-content\/uploads\/2023\/03\/2023-03-16-00_44_29-gpt-4.pdf-442x580.png 442w\" sizes=\"auto, (max-width: 270px) 100vw, 270px\" \/><\/a><figcaption id=\"caption-attachment-17805\" class=\"wp-caption-text\">University-level physics exam question, in French, and GPT-4&#8217;s response. 
Source: <a href=\"https:\/\/cdn.openai.com\/papers\/gpt-4.pdf\">OpenAI<\/a>.<\/figcaption><\/figure>\n<p><span style=\"font-weight: 400;\">Because GPT-4 takes multimodal input, the report states that it scored in the top percentile in a Biology Olympiad exam which has many image-based questions. The report also has striking examples of <strong>GPT-4 answering questions that involved interpretation of complex graphical inputs<\/strong> such as charts, and a university-level physics exam question on heat transfer in a conductive bar<\/span><span style=\"font-weight: 400;\">. OpenAI demonstrated how GPT-4 could take a photograph of a hand-drawn app interface and turn it into functional code within seconds (skip to 16:17 in the <\/span><a href=\"https:\/\/www.youtube.com\/watch?v=outcGtbnMuQ&amp;ab_channel=OpenAI\"><span style=\"font-weight: 400;\">webinar recording<\/span><\/a><span style=\"font-weight: 400;\">).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">GPT-4 is also designed to process up to 32,000 &#8216;tokens&#8217; at once (a token is about 0.75 of a word in English, so this is about 50 pages&#8217; worth of text), compared to GPT-3.5, whose limit was about 4,000 tokens, although currently the 32,000-token version is only available to programmers via the GPT-4 application programming interface. 
<strong>This means that GPT-4 can consume and generate dissertation-length writing.<\/strong> GPT-4 also has improved &#8216;steerability&#8217;, which means that users and programmers will be able to ask it to assume particular roles \u2013 OpenAI&#8217;s GPT-4 <\/span><a href=\"https:\/\/openai.com\/research\/gpt-4\"><span style=\"font-weight: 400;\">launch webpage<\/span><\/a><span style=\"font-weight: 400;\"> has a striking example of GPT-4 being steered to act as a Socratic tutor, guiding students through a question with questions, rather than providing the answer.<\/span><\/p>\n<h2>What can GPT-4 actually do in practice?<\/h2>\n<p><span style=\"font-weight: 400;\">In the few hours since its public release, people have <\/span><a href=\"https:\/\/twitter.com\/LinusEkenstam\/status\/1635754587775967233\"><span style=\"font-weight: 400;\">already been using it<\/span><\/a><span style=\"font-weight: 400;\"> to code up web applications from scratch, perform drug discovery, explain financial transactions, and generate &#8216;one-click lawsuits&#8217;. On the academic side of things, <\/span><a href=\"https:\/\/elicit.org\/gpt4-waitlist\"><span style=\"font-weight: 400;\">GPT-4 is being integrated with Elicit<\/span><\/a><span style=\"font-weight: 400;\">, an AI research assistant that rapidly analyses literature and helps you look across papers to discover shared concepts and answer your research questions. Khan Academy is developing <\/span><a href=\"https:\/\/www.khanacademy.org\/khan-labs\"><span style=\"font-weight: 400;\">Khanmigo<\/span><\/a><span style=\"font-weight: 400;\">, an AI-powered &#8220;tutor for learners&#8221;, which uses GPT-4 to personally guide students through completing questions. 
<\/span><a href=\"https:\/\/blog.duolingo.com\/duolingo-max\/\"><span style=\"font-weight: 400;\">Duolingo<\/span><\/a><span style=\"font-weight: 400;\">, the language learning app, is using GPT-4 to allow language learners to practise conversation skills and get explanations for mishaps. <strong>It&#8217;s possible that AI is sufficiently advanced now to start addressing the &#8216;<\/strong><\/span><strong><a href=\"https:\/\/www.edsurge.com\/news\/2014-08-10-personalization-and-the-2-sigma-problem\">2 sigma problem<\/a><\/strong><span style=\"font-weight: 400;\"><strong>&#8216;<\/strong>, where individualised teaching tailored to the needs of each student may be possible at scale.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">More immediately, <\/span><a href=\"https:\/\/twitter.com\/emollick\"><span style=\"font-weight: 400;\">Ethan Mollick<\/span><\/a><span style=\"font-weight: 400;\">, a professor at the Wharton School, has already been using GPT-4 (through Bing Chat) for a few weeks, <\/span><a href=\"https:\/\/oneusefulthing.substack.com\/p\/feats-to-astonish-and-amaze\"><span style=\"font-weight: 400;\">demonstrating that GPT-4 can be creative<\/span><\/a><span style=\"font-weight: 400;\">, drawing meaningful connections between disparate ideas. Mollick also shows that GPT-4 can apply theory to practice, across different domains and generating novel insights, in fields such as <a href=\"https:\/\/twitter.com\/emollick\/status\/1633623048958930944?t=fStwclqKOJ_erVCmNkjJRg\">marketing, research, academia, and consulting<\/a>. OpenAI&#8217;s GPT-4 technical paper cites examples of the AI answering multiple choice questions from 57 different subjects with 86% accuracy.<\/span><\/p>\n<h2>What are its limitations?<\/h2>\n<p><span style=\"font-weight: 400;\">OpenAI is careful to point out limitations of GPT-4, saying in its technical paper that &#8220;it is not fully reliable (e.g. 
can suffer from &#8216;hallucinations&#8217;), has a limited context window, and does not learn from experience. <strong>Care should be taken when using the outputs of GPT-4, particularly in contexts where reliability is important<\/strong>&#8220;. That said, OpenAI <\/span><a href=\"https:\/\/openai.com\/research\/gpt-4\"><span style=\"font-weight: 400;\">claims<\/span><\/a><span style=\"font-weight: 400;\"> that GPT-4&#8217;s factuality is 40% better than GPT-3.5. Additionally, GPT-4&#8217;s training data consists of web texts and other material available before September 2021 \u2013 this is similar to GPT-3.5. However, Bing Chat (powered by GPT-4) is internet-connected and does not have this limitation, which <\/span><a href=\"https:\/\/twitter.com\/emollick\/status\/1633707096960016384\"><span style=\"font-weight: 400;\">seems to reduce<\/span><\/a><span style=\"font-weight: 400;\"> the propensity to hallucinate.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">GPT-4&#8217;s developers have also implemented safety measures that prevent it from generating harmful responses. The <\/span><a href=\"https:\/\/cdn.openai.com\/papers\/gpt-4-system-card.pdf\"><span style=\"font-weight: 400;\">system card for GPT-4<\/span><\/a><span style=\"font-weight: 400;\"> illuminates a number of mitigations that prevent the AI from responding to prompts that elicit biased, offensive, criminal, unsafe, or other inappropriate content. (Incidentally, pages 15 and 16 of the system card have somewhat terrifying examples of what the AI was capable of when allowed to interact with the real world).<\/span><\/p>\n<p><span style=\"font-weight: 400;\">One non-functional limitation of GPT-4 is the <strong>lack of transparency around its training data<\/strong>. 
OpenAI states in its <\/span><a href=\"https:\/\/cdn.openai.com\/papers\/gpt-4.pdf\"><span style=\"font-weight: 400;\">technical paper<\/span><\/a><span style=\"font-weight: 400;\"> that, &#8220;[g]iven both the competitive landscape and the safety implications of large-scale models like GPT-4, this report contains no further details about the architecture (including model size), hardware, training compute, dataset construction, training method, or similar&#8221;.<\/span><\/p>\n<h2>What does this all mean for higher education?<\/h2>\n<p><span style=\"font-weight: 400;\">Honestly, we don&#8217;t have all the answers yet. However, it&#8217;s clear that generative AI is advancing at an incredible pace (there are <\/span><a href=\"https:\/\/twitter.com\/abacaj\/status\/1635837820270002178\"><span style=\"font-weight: 400;\">rumours<\/span><\/a><span style=\"font-weight: 400;\"> that GPT-5 is already being trained) and will <\/span><a href=\"https:\/\/www.afr.com\/technology\/with-gpt-4-we-really-are-on-the-verge-of-revolutionising-office-work-20230315-p5cscd\"><span style=\"font-weight: 400;\">revolutionise the world of work<\/span><\/a><span style=\"font-weight: 400;\"> in ways that may not yet be known. 
Sydney&#8217;s fundamental approach of <\/span><a href=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/ai-and-education\/\"><span style=\"font-weight: 400;\">productive and responsible engagement<\/span><\/a><span style=\"font-weight: 400;\"> with AI in learning, teaching, and assessment remains unchanged \u2013 and the need for this has been reinforced through <\/span><a href=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/student-staff-forums-on-generative-ai-at-sydney\/\"><span style=\"font-weight: 400;\">conversations with students<\/span><\/a><span style=\"font-weight: 400;\">.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">In the short to medium term, the increased abilities of GPT-4 (and Bing Chat, and other AI tools that will inevitably be released) to be creative, solve problems, generate hypotheticals, interpret images, and draw information from the internet (as Bing Chat can do) make some existing advice somewhat outdated \u2013 even after less than a month. <strong>Updating written assignments to focus on creative thinking and problem solving, and to refer to contemporary sources and events, will no longer necessarily make them AI-proof.<\/strong><\/span><\/p>\n<p><span style=\"font-weight: 400;\">Most of our <\/span><a href=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/how-can-i-update-assessments-to-deal-with-chatgpt-and-other-generative-ai\/\"><span style=\"font-weight: 400;\">previous advice<\/span><\/a><span style=\"font-weight: 400;\"> for assessment in the age of generative AI has centred on building students&#8217; AI literacy, leveraging AI to personalise assessment tasks, trying multimodal assessments, and <\/span><a href=\"https:\/\/theconversation.com\/as-uni-goes-back-heres-how-teachers-and-students-can-use-chatgpt-to-save-time-and-improve-learning-199884\"><span style=\"font-weight: 400;\">focusing on the process and not the product<\/span><\/a><span style=\"font-weight: 400;\">. 
This advice, thankfully, still applies even with the release of GPT-4. Indeed, we are seeing a number of <\/span><a href=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/how-sydney-academics-are-using-generative-ai-this-semester-in-assessments\/\"><span style=\"font-weight: 400;\">Sydney academics integrating AI into assessments in creative ways<\/span><\/a><span style=\"font-weight: 400;\">, leveraging AI&#8217;s abilities to summarise, suggest, search, and save time.<\/span><\/p>\n<p><span style=\"font-weight: 400;\">GPT-4&#8217;s release perhaps puts more traditional assessments at increased risk, encouraging us to more urgently consider how we can <\/span><a href=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/how-sydney-academics-are-using-generative-ai-this-semester-in-class\/\"><span style=\"font-weight: 400;\">help students productively and responsibly engage with AI<\/span><\/a><span style=\"font-weight: 400;\"> and how we can update our assessment practices to refocus on the process and joy of learning.<\/span><\/p>\n<h2>Tell me more!<\/h2>\n<ul>\n<li><a href=\"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/ai-and-education\/\">Check out our curated resources<\/a> for engaging with AI at Sydney<\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>The internet has exploded again, this time with the release of GPT-4 on Pi day, 2023. 
But what is this next generation AI, and&#8230;<\/p>\n","protected":false},"author":14,"featured_media":17808,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[56,57],"tags":[2300,110,2503,2505,2542],"coauthors":[463],"class_list":["post-17802","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news-events","category-teaching-tips","tag-artificial-intelligence","tag-assessment","tag-chatgpt","tag-generative-ai","tag-gpt-4","post-item","post-even"],"_links":{"self":[{"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/posts\/17802","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/comments?post=17802"}],"version-history":[{"count":7,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/posts\/17802\/revisions"}],"predecessor-version":[{"id":17820,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/posts\/17802\/revisions\/17820"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/media\/17808"}],"wp:attachment":[{"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/media?parent=17802"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/categories?post=17802"},{"taxonomy":"
post_tag","embeddable":true,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/tags?post=17802"},{"taxonomy":"author","embeddable":true,"href":"https:\/\/educational-innovation.sydney.edu.au\/teaching@sydney\/wp-json\/wp\/v2\/coauthors?post=17802"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}