CCTP-6057: Intro to AI: Design Principles and Applications: Professor Martin Irvine

Georgetown University
Graduate School of Art and Sciences
Communication, Culture & Technology Program

CCTP-6057: Intro to AI: Design Principles and Applications
Professor Martin Irvine
Spring 2025

This course provides a conceptual and design-oriented introduction to the key concepts for computing, digital information, and data from the first foundations of these technologies to the applications now being developed for Machine Learning (ML), Artificial Intelligence (AI), and Cloud computing. This course is especially designed for students from non-technical backgrounds, and provides methods for “deblackboxing” complex technologies through the key concepts and design principles approach. Students will learn how the technologies can be made accessible through an interdisciplinary framework, which combines Systems Thinking, Design Thinking, Semiotic Theory, and the Ethics and Policy viewpoint. By the end of the course, students will have achieved (1) a conceptual, design-oriented understanding of computing, data, AI/ML, and Cloud systems, and (2) a competency in design thinking and systems thinking that can be applied to any sociotechnical system. Since the ability to communicate conceptually clear and truthful explanations of our technologies across technical and non-technical communities is greatly needed in every field and profession, students who have learned these competencies will be able to take on thought leadership roles in any career that they want to pursue.

Framework and Main Approaches

Every day, the news media, popular discourse, marketing, and advertising are full of statements about AI/ML and Data technologies, but they are treated as unfathomable “black boxes” and corporate-branded products. To reverse this "blackboxing," this course will provide the methods, key concepts, and analytical tools for understanding the designs of the systems, devices, and interfaces that we use every day.

Our learning method comes from applying an interdisciplinary framework, which combines:

(1) “Systems Thinking” to understand how a specific technology is part of a larger, interrelated system (for example, computing systems, kinds of software, networks, and social contexts);

(2) “Design Thinking” for uncovering how and why certain technologies are designed the way they are, including the history of designs and the consequences of design choices;

(3) “Semiotic Thinking” for understanding these technologies as artefacts of human symbolic thought, and how we can delegate some human symbolic process to systems designed to automate them in mathematical and computational "Neural Network" graphs and statistical models.

(4) the “Ethics and Policy” viewpoint for evaluating the social consequences of design choices in the way that technologies are implemented, and for analyzing the implications for ethics and governmental policy..

Outcomes

By the end of the course, students will have achieved (1) a conceptual, design-oriented understanding of computing, data, AI/ML, and Cloud systems, and (2) a competency in design thinking and systems thinking that can be applied to any sociotechnical system. Since the ability to communicate conceptually clear and truthful explanations of our technologies across technical and non-technical communities is greatly needed in every field and profession, students who have learned these competencies will be able to take on thought leadership roles in any career that they want to pursue.

Official Syllabus: with course requirements, expectations, resources, and university policies.

Course Details: Course Format, Requirements, Grading, and Academic Policies

Course Format

The course will be conducted as a seminar and requires each student’s direct participation in the learning objectives for each topic and each week’s class discussions. Each syllabus unit is designed as a building block in the interdisciplinary learning path of the seminar. For each week, students will write a short essay with comments and questions about the readings and topics of the week (posted in the Canvas Discussions module). Students will also work in teams and groups on collaborative in-class projects and group presentations prepared before class meetings.

In addition to in-class discussions and exercises, students will participate in the course by through a suite of Web-based learning platforms and etext resources:

(1) A custom-designed course Website created by the professor for the syllabus, links to readings and instructional videos, and weekly assignments:
https://irvine.georgetown.domains/6057/ [this site].

(2) An e-text course library (shared Google Drive folder) and shared Google Docs. Most readings (and research resources) are available in pdf format in a shared Google Drive folder prepared by the professor (links in the online syllabus). Students will also create and contribute to shared, annotatable Google Docs for certain assignments and discussions (both during class meetings, and working on group projects outside of class).

(3) The Canvas discussion platform for weekly written assignments and discussion.

(4) Zoom video conferencing for virtual office meetings (when necessary or more convenient).

See the full syllabus document (in pdf) for requirements, expectations, and university policies

Grades & Requirements

Grades will be based on:

Weekly short writing assignments (posted to the Canvas Discussions platform) and participation in class discussions (50%). Weekly writing must be posted at least 6 hours before each class so that students will have time to read each other's work before class for a better informed discussion in class.
An individual final "Capstone" research essay in which students can apply their learning in the course in a research project for developing the concepts and methods of the course (50%).

How to fulfill the evaluation criteria (“rubrics”) for each 50% of the grade will be included in the assignment descriptions in the Website syllabus.

Professor's Office Hours
Before class meetings, by appointment, and scheduled Zoom video conferences.

Professor's Contact Information: email: irvinem@georgetown.edu

Books, Readings, Online Resources, Research Tools

Books and Resources

This course will be based on an extensive online library of book chapters and articles in PDF format in a shared Google Drive folder (access only for enrolled students with GU ID). Most readings in each week's unit will be listed with links to pdf texts in the shared folder, or to other online resources in the GU Library.

Required Book:

Peter J. Denning and Craig H. Martell. Great Principles of Computing. Cambridge, MA: The MIT Press, 2015.

Recommended Books:

Alpaydin, Ethem. Machine Learning: The New AI. Rev. ed. Cambridge, MA: MIT Press, 2021.
Stuart Russell, Human Compatible: Artificial Intelligence and the Problem of Control. New York: Viking, 2019.

Georgetown Library Main Search Page: Get to know and Use

Learn how to search for books, journals, databases, and other media

Required:
Premium paid accounts for OpenAI's ChatGPT, Anthropic's Claude, and Perplexity.

This course has a very low textbook cost; instead, we will use online sources extensively, including the online AI/ML platforms for experimenting and research. The paid accounts provide fuller access to the AI engines and more extensive results. Your prompt dialog is also saved in a "Library."

Required: Use Zotero for managing bibliography and data for references and footnotes.

Zotero link. || Instructions and link to app, Georgetown Library (click open the "Zotero" tab).
You can save, organize, export and copy and paste your references with formatted metadata into any writing project. A life-saver for your graduate studies and beyond!

Recommended: Subscribe to the MIT Technology Review (see Student Subscription)

See section on Artificial Intelligence. Subscribe to email newsletters: The Download (daily tech news) and The Algorithm (news about AI and data).

Course Online Library for Readings and Research Sources
(Google Drive: GU student login required)

Week 1: Orientation to the Course: Learning Goals, Main Topics, and Methods

Orientation to Learning Goals of the Course:

Establishing some useful definitions, distinctions, and scope of subject matter: the key concepts and design principles approach.
Introducing our interdisciplinary framework that provides an integrated method for understanding all kinds of computational systems, and how to apply the method to today's computing systems, networks, AI, and Cloud systems.
Key terms and concepts: What are we talking about when we talk about "Computation," "Computing Systems," "Information," "Data," "Artificial Intelligence," "Machine Learning", "Cloud Computing"? Why clear and consistent vocabulary and concepts is so important.

Introductions

Who are we? Backgrounds and interests to be considered in developing the course.
Your professor's background and research interests: where I'm coming from.

Course Introduction: Requirements, Expectations, Goals

Format of course, requirements, participation, weekly assignments, projects, outcomes (see above).
Using our Web-based syllabus, Canvas Discussion module, online etext library (shared Google Drive).
- Why I use custom-designed websites for courses: teaching philosophy, instructional design, student access to materials.
Course Road Map
- Week Units are Building Blocks.
- Topics and content will be modified and updated as we proceed through the course.
Using laptops in class: most weeks will involve experiments and research in class. Students should bring a laptop (or tablet + keyboard) to class for using AI applications and doing real-time research on our course topics.
- Rules: laptops are for course use; mobile devices and social media are not to be used in class.

Using Research Tools for this Course (and beyond)

Required: Install and learn how to use Zotero for managing bibliography and data for references and footnotes.
- Directions and link to app, Georgetown Library (click open the "Zotero" tab).
  You can save, organize, export and cut and paste your references with formatted metadata into any writing project.
Required: Using the Georgetown Library Main Search Page
https://www.library.georgetown.edu/ (bookmark this).
- Access to journals, books, and databases for further study and research.
Required: Sign up for the premium paid accounts for GenAI platforms
- This course has a very low textbook cost; instead, we will use online sources extensively, and experiment with AI/ML platforms for kinds of results and for research on course topics. The paid accounts provide fuller access to the AI engines and more extensive results. Your prompt dialog is saved in a "Library."
OpenAI's ChatGPT
- See OpenAI's current description of GPT's capabilities
Anthropic's Claude
Perplexity

Introduction to the course: methods and main topics [Presentation; continued to next week.]

Examples for Discussion

We will do a lot of learning by using data and AI/ML applications, and experimenting with the systems to help reveal how they are designed and how they work.
Generative AI Examples (in-class demonstrations and discussion)
- ThisPersonDoesNotExist.com (StyleGAN Nvidia project). "Fakes"?
  - Background on StyleGAN (Generative Adversarial Networks (GANs) (Wikipedia)
- Generated.photos
- OpenAI | ChatGPT-4o + DALL-E | Perplexity
- What are we doing when engaging the generative AI systems? How many systems are being "orchestrated" by design behind the scenes between our inputs and system outputs?
The social, legal, and ethical anxiety around LLMs and generative AI about.
Recent Research Papers and Books: Assumptions and open questions.

Week 2: Introduction to Methods, Key Concepts, and First Steps

Main Topics and Learning Objectives

Context for getting started: AI, Machine Learning (ML), and all uses of Data today are built on the whole history of computing and digital technologies. Further, because AI/ML represent a cumulative summation of the principles of computing and data encoding, learning about AI/ML systems is a great entry point to learning about the key concepts and design principles for computing, information, data, and programming. The built-in history of the technologies also means that we can't understand AI/ML without understanding the basics of computer system design, data design, and the techniques for data analysis that make recent AI/ML possible.

This approach to AI/ML through computer system design principles will also help you understand what is (or may be) possible, and not possible, for AI/ML systems. The knowledge provided by this approach will also give you the conceptual tools for critiquing and correcting false claims and for participating in discussions about AI ethics and policy. This week you will be introduced to our main framework for this approach, and take some first steps in learning how to apply it. We will review:

The learning goals of the course (in the Introduction below).
The interdisciplinary framework and main approaches in the course for "deblackboxing" computing systems, data, AI, and Machine Learning.
How the "Key Concepts" and "Design Principles" approach opens up the technologies for understanding, especially for students without a technical background.
Getting started on asking questions: what are we talking about when we talk about "AI"? "Machine Learning"? "Generative AI"?

Readings and Video Lessons

Prof. Irvine, Introduction to the Course: Our Framework, Methods, and Key Concepts (1)
- Download, read, and print for reference. Don't worry if all this is new and hard to understand now: you can, and will, get it! (We will go over all the key concepts, step by step, in the course.)
Video Lessons: Introducing the Crash Course Series on AI and Computer Science
- The "Crash Course" video lesson series on Computer Science and AI provide excellent introductions to technical concepts and design principles. We will complete the basic "how things work" approaches with the conceptual framework for the "why" and "how is it possible" questions for the technologies we are studying.
- Review the list of lessons for each series below: they are short and you can learn at your own pace. Here are some to get you started with background for the course:
- Crash Course, Computer Science series (list of lessons).
  - For this week: view the Preview, then Lessons 1-3 (2-4 in playlist)
- Crash Course, Artificial Intelligence series (list of lessons).
  - For this week: View the Preview and Lesson 1: What is AI?
Class discussion: Clarifying Key Terms and Concepts: Artificial / Machine / Intelligence / Learning

Examples for Discussion (in class): First Steps for Deblackboxing

Experiments with generative AI platforms: ChatGPT and Perplexity.
- Experiment on definitions in class: prompting ChatGPT and Perplexity for definitions and explanations. What questions can you ask about what is happening between our inputs and the system outputs?

Prof. Irvine, Introduction: Methods, Topics and Key Concepts of the Course (Slides)

Writing assignment (Canvas Discussions Link)

Read the Instructions for the weekly writing assignment..
This week provides a "top level" overview of the topics and main approaches in the course. In the following weeks, we will study the key concepts and design principles that explain the "how" and "why" of our data and AI/ML technologies.
Your first discussion post this week can be informal. In this week's post, write some notes about your main "take aways" and questions from the readings (and video lessons). Any "Aha!" moments when something became understandable, or a bit less "black-boxed"? What questions do you have about the main concepts and topics of the course? What would you like to have explained more in class?

Week 3: Computer System Design Principles & Introduction to AI/ML Systems

Learning Objectives and Main Topics:
Fundamentals of Computer System Design and AI/ML

This week provides an overview of computer system design as a foundation for learning about AI and ML. The video lessons and introductory readings will help you begin developing your own knowledge base of terms and concepts that you can use in our deblackboxing goals.
Your learning goal for this week is to begin understanding basic computing design principles, so that you can take the next learning steps for understanding how and why all our contemporary systems -- from any kind of computer system and digital media to AI/ML and Cloud systems -- are designed. After the basic introductions, you will also learn how systems theory/thinking enables us to design large systems in scalable layers or levels using many kinds of interconnected subsystems (lower levels systems serving the whole system). All these essential design principles (and the methods for implementing them in manufactured components and programs), of course, are "inside" many nested "black boxes," and we only observe the results. But we can begin to understand how and why our data and AI/ML systems work the way that we perceive them to work by opening up some of the fundamental design principles of computing systems, programming, and data encoding. There is no magic, there are no "hidden" mysteries -- only human designs for complex systems that serve our core human capabilities!
Key Questions: beginning this week and following up throughout the course:
- What is a (digital binary) computer? What makes a computer a computer? Why should we always say "computer system" rather than just "computer"?
- How is everything we are doing today with data, AI and ML computationally possible?
- How did we get here (in computing and AI)?
- How do computer system design principles allow for scaling up processors and memory capacity to facilitate the huge "compute" power needed for data and processing in today's AI/ML systems?

Readings and Video Lessons:

Prof. Irvine, (Video) "Introduction to Computer System Design" [in settings, switch to 720p or 1080p]
- I made this video for CCT's Intro course (5005), but also for introducing the "why" and "how" of digital computer system design for students in any course.
- You can study the presentation in the Google Slides version, if you want to go at your own pace.
Prof. Irvine, Introduction (Part 2), The Semiotic Systems View of Computing and AI (goes with the video and slide presentation above).
Video Lessons
- Crash Course Computer Science: Study Lessons no. 4-8 (5-9 in Playlist)
- You can study these great short lessons at your own pace from this week on. We will refer to the foundational throughout the course, though we won't be able discuss each unit.
Peter J. Denning and Craig H. Martell, Great Principles of Computing. MIT Press, 2015 (in pdf). Read chapters 1-2 for this week. In Chap. 2, on the major "Domains" of computing, note the sections on "Artificial Intelligence," "Cloud Computing," and "Big Data."
- We will refer to the computing principles outlined in this book throughout the course. Even though the book is for non-specialists, much may be new to you, and it will take time for the terms and concepts to become yours to think with. That's normal and OK. Read and re-read it as we progress through the course.
Scaling Computer Architecture: Cloud Data Centers for AI/ML
- Video: Cloud Computing Explained (very brief overview).
- Video: How ChatGPT runs on Microsoft's Azure Cloud system
- We won't be able to cover the fascinating history of how we were able to scale up and extend the basic ideas in digital systems design, and then connect unlimited numbers of computers via a digital network (the Internet). But at this point, doing a quick fast-forward, you should have a brief view of how cloud data centers are designed to combine all the elements of networked computing systems to scale up to any level of complex combinations. We now cluster together thousands of processors with billions of memory units (active RAM + storage) for all kinds of programming and data. The design and development of data centers around the world has provided the computing power needed for the recent successes in ML/AI systems. (More later.)
Ethem Alpaydin, Machine Learning. Revised edition. MIT Press, 2021.
- Read the Preface and Chap. 1 (to p.34). (We will consult this book later in the course.)
- We are beginning the study of AI/ML in the context of computing so that you have a view of how computational processes in programs must be designed to perform the multi-level computations in AI/ML applications.
- Question: Is there a useful distinction between "AI" and "Machine Learning" (ML)? (Today, most of what is called "AI" is ML. Does this matter?)
Prof. Irvine, Key Terms and Concepts Glossary (download, print, and annotate).

Examples for discussion in class:

I asked this question to ChatGPT-4o: "Why do people attribute human characteristics to AI systems and computers?" Here is the response (go to link).

Class Discussion: Clarifying Terms and Concepts

Writing assignment (Canvas Discussions Link)

You will be learning more about computer system and data design through the course, and this week is for helping you get started and going on to learn more. Consider one or two of the computer system design principles (from the video lessons and readings) that help you "deblackbox" why modern computer systems embody these specific kinds of designs. (Much will still be new and difficult, and you will have many questions: take notes to capture your questions). Could you discover some of the "why" and "how" behind the technical descriptions (in the video lessons and readings)? What terms and concepts that you learned helped you to "deblackbox" what seems closed and inaccessible? What do we need to go over more fully in class?

Week 4: Data Design (1) : Text Encoding: How Written Symbols Become Computable Data

Learning Objectives and Topics:

In Weeks 4 and 5, students will learn the foundational concepts of "data" as defined in digital electronics, computing, data sciences, and AI/ML. We will focusing on text and image data, the most widely used types of digital data, and the focus of AI/ML applications and research.

At the end of week 5, you will be able to answer these questions: How is it possible to encode the two key forms of human symbolic expression -- written characters (signs) and images -- so that they can become computable objects (data) for digital computer systems and AI/ML processes? How have the now-standardized methods for encoding text and images become the key for creating computable data sets used in all AI/ML applications?

This week we will focus on text encoding -- how written text becomes computable in binary representations that programs can interpret and process. Standard formats for encoding text characters are the foundation for Natural Language Processing (NLP) and all text-based data in automated systems: Machine Translation, Large Language Models (LLMs) like ChatGPT, and speech recognition systems (which do fast feature recognition on digitized audio to generate encoded text).

Video Lessons and Readings (in this order):

Prof. Irvine, Introduction to Data and Text Encoding
Video Lessons

Basics Explained: Why Do Computers Use 1s and 0s: Binary Explained
Computer Science: Digital Character Sets: ASCII to Unicode
Unicode YouTube Channel: Introduction to Unicode and Text Encoding [really nicely done!]

Digital Text Encoding for all Languages: Unicode

The Wikipedia overview of Unicode and digital character encoding is useful.
For Reference: The Unicode Consortium Official Site
- The Current Unicode Standard [now Version 16.0, as of Sept. 2024]
- Unicode 16:0 Code Charts (for all languages)
The Unicode Standard -- Core Specification (Current Version) (also available in pdf)
- Read Chap. 2: Architectural Context and Unicode Design Principles
- Survey the rest of the contents (as you have time or interest)
For reference: Unicode Character Code Charts for All Languages [current version, 16.0]

UTF-8 (Unicode Transformation Format - 8 Byte Units). See the definition of "UTF." UTF-8 is the most commonly used code format for European alphabet-based language (including the character encoding of the Web page in your current screen display "window"). (Unicode also has extended 16 and 32-bit byte units for other languages and scripts.)
For "Han" Unified CJK (Chinese, Japanese, and Korean) Ideographs and Scripts (pdf reference file). (There are many extensions of encodings for Asian scripts in the Code Chart.)
- (On "Unified CJK" see Wikipedia.) Note that the Unicode practice is to decompose scripts and ideographic writing systems in the smallest meaningful "glyphs" (strokes, marks), which can be assigned a byte unit or code variation, which can then be composed (combined) in a Unicode code point definition.

Unicode Emoji (pictographic symbols) | Code Chart List | See the Full Emoji List

Yes! All emoji are encoded as "characters" and not "images." Emoji are encoded as Unicode bytecode (in a range of "code point" numbers) to be interpreted in software and then projected in any device's graphics processing and screen.
See the Current Unicode Full Emoji Chart with Modifiers (for skin tone and other modifications).
The Unicode "test file" of all currently defined "emojis" (v.16.0) (with byte code, symbol, and description) (long file).
- Note: This is a "plain text" file (in Unicode characters, of course) with Emoji symbols encoded to test how they are interpreted and displayed with the software and graphics rendering of various systems. You may find some don't display in your Web browser or app, and the Emojis display differently from device to device. Why? (Think about each device's OS, graphics hardware and software, and type of screen.)

Unicode "behind the scenes" in applications (In-class exercises)

Unicode lookup tools
Google Translate
Looking ahead: from Unicode in data sets to "tokenization" in LLMs.

Writing assignment (Canvas Discussions Link)

Referring to at least two of the readings or sources:

Discuss your main "take-aways" in what you learned about how text can become a form of digital (computable) data, and about Unicode as a design solution for an encoding standard. Can you explain how the Unicode standard allows us to encode written characters and symbols in any language through an internationally adopted "byte code" format that can be recognized by all programs and display software in any computing system and digital device? Questions? What would you like to have explained further in class?

Week 5: Data Design (2): Digital Images: Encoding Images as Computable Data

Learning Objectives and Main Topics:

Understanding Digital Images as Data: Foundations for AI/ML

Last week you learned about the major data type: text (or "string") data, and how a standard encoding format (Unicode) enables a standard set of defined byte units to be interpretable across any kind of computer system and software programmed for text data. The standardized encoding format is what enables the NLP processes for collecting text data sets used in Machine Learning for training all LLMs

This week, we will study how digital images are encoded as forms of data. The ML techniques for analyzing, classifying, and labeling image data is different from those used for text data because digital images (mainly as photos) are encoded as mathematical formulas for the geometry of two-dimensional objects.

Before going further inside the "black boxes" for ML/AI image analysis, transformation, and generation, it's really important to understand how photos and images become digital data, and how digital images are encoded. We need to answer basic questions:

What is a "digital photo"? How are photo images encoded as digital data?
How can software be designed to create, interpret, change, and transform a digital photo image?
What kinds of further data (metadata) goes with a digital image file, and how is this interpreted in software and in ML processes?

Next week we will build on tis foundation for understanding pattern recognition and classifying techniques for digital photo data used in ML.

Readings and Video Lessons:

Prof. Irvine, Introduction to Digital Images as Data: How Digital Photos are Encoded
Video Lessons
- Crash Course Computer Science: View Lessons 15 (Data Structures) and 21 (File Systems). In Lesson 21, notice how the digital representation techniques for sound and images are different from text, and note how digital image data is structured in data file formats.
- Code.org: Images, Pixels and RGB (by a co-founder of Instagram)
- Branch Education: How do Smart Phone Cameras Work?
  - Good "deblackboxing" view, and excellent background on light and human vision. Sections may seem like too much technical detail, but there's a lot of useful knowledge here.
Ron White and Timothy Downs, How Digital Photography Works. 2nd ed. Indianapolis, IN: Que Publishing, 2007. Excerpt.
- Well-illustrated presentation of the basics of the digital camera and the digital image. The digital image principles are the same for miniature cameras in smart phones.
- Study pages on "how light becomes data," and "how images become data," and Chap. 9 on how software works with digital images. This background is important for understanding how AI/ML algorithms can be programmed to interpret and analyze image data.

Optional: A little more technical background if you have time and want to go further:
Video lessons from Computerphile:
- How are Digital Images Represented as Computer Data?
- How are Digital Images Captured in Digital Cameras?

Using Images as Data in AI Systems (In-Class exercises)

Writing assignment (Canvas Discussions Link)

This assignment will help you understand what a digital photo is as a data type, and how a digital photo in a file format (.jpg, .gif, etc.) becomes computable data in all software contexts and in AI/ML applications. Use these steps for preparing your discussion post:
(1) Choose a digital photo image that you've taken -- or take a new one -- as an example to study. Use a digital camera if you have one, or a mobile phone photo. Copy the image file to a folder in your PC/Mac (by whatever method you use). [If you choose a photo from a mobile device, you can first upload the file to a storage location (e.g., cloud) and then download it to your Mac/PC; or use a Bluetooth file transfer app; or send it to yourself in email, so that you can save it to your PC/Mac for using it in software.] Then, use any photo image viewer you have on your PC/Mac to view the image on your screen. If your photo viewer has editing and modification routines (like filters or styles, morphing, etc.), make some changes and save the altered image.
Posting (Part 1): Upload your image (and any edited or modified versions) at the beginning of your post, and write your discussion comments below your photo example(s). For the first part of your post, refer to the readings and video lessons for this week and explain the steps in the digital data process from taking the photo with a digital camera (lens + sensor), through the stages in memory locations in different devices, and then to the viewable image displayed on your screen with your image viewer program.
(2) Next, find out as much as you can about the properties of your image in the metadata for the file. If you know how to use Photoshop (or a similar digital photo editing program), you can display the "Information" in the "metadata" in the image file. You can also view some of the metadata in the image file in the "information" in the Mac OS file view, or in the "Properties" of a file in a Windows PC folder. Copy or transcribe the info into your post.
- Easy way: Or you can use this good free online image viewer: Forensically (Image Analyzer). Upload or drag and drop your file from where you saved it into the app window. The "Magnifier" provides a pixel-element enlarger. Click on Metadata to view the metadata in your file. You can highlight and copy and paste the text data.
Second part of your post: Copy some of the main metadata fields into your post, and comment on what you found (as best you can). Have fun!

Week 6: Introduction to Pattern Recognition: The Foundation of AI & ML

Learning Objectives and Main Topics

Goal: Learning the foundational principles for Pattern Recognition in AI/ML with the "Artificial Neural Net" (ANN) calculation methods for features and properties in a digital image, and how basic feature and pattern recognition techniques are used in digital image analysis.
- For this major topic, you will have an introduction to how ANNs are designed in ML for simulating the human processes of detecting features and recognizing patterns in our two main symbolic systems: written characters (symbols) and images. This week we will focus on digital photo images, building on what you learned last week.
This week, you will be able to connect last week's study of digital photo images as computable data with the ML techniques for pattern recognition and feature analysis in ML.
Pattern Recognition (which includes a first step in Feature Detection) is the main problem domain of AI/ML: how can we design computational systems that "recognize" (can classify, categorize, or label) the regularities or invariants (= "patterns") in all kinds of data for analysis (interpretations) and predications about new data of the same type. An essential step in ML is therefore analyzing features and patterns in selected finite sets of data ("training data sets") to develop models that can be applied to analyzing new instances of the same kind of data.
- This main problem domain has created a large multi- and inter-disciplinary field that combines computer science with philosophy, logic, mathematics and statistics, linguistics, and cognitive psychology.
In ML methods, detecting "features" and recognizing "patterns" in all the varying kinds of properties in data is based on logical-mathematical models for establishing statistical invariance (getting better and more finely tuned approximations to properties of data that indicate patterns, what stays the same over millions of variable instances). We will never have 100% absolutely true or false feature detection and pattern recognition, but the methods in AI and ML are designs for getting "close enough" (i.e., useful) approximations on which interpretations can be made (on already collected data in the finite set) and on which decisions can be made about new, future data instances, and then (for generative AI) using the whole network of data property approximations to project (generate) new representations from the patterns in the data (e.g., text and image generators).
What's "inside" the AI/ML Black Boxes? You'll discover there is nothing "black" inside a complex system for an AI/ML application (just lots of human-designed and human produced expression: advanced math and programming code, large data sets of text and images that we have produced represented in computable digital formats, and large computer systems with many fast processors and almost unlimited memory capacity).
- Continuing our basic approach, de-blackboxing for the steps in ML/AI means making the invisible visible, gaining access to what is unobservable in AI/ML computational processes as implementations of the design principles for AI/ML.

Readings and Video Lessons [in this order]

For an overview of the importance of Pattern Recognition and the major approaches in computing and AI/ML, see the Wikipedia article.
Geoff Dougherty, Pattern Recognition and Classification: An Introduction (New York: Springer, 2012). Excerpts from Chaps. 1-2.
- Read the Introduction and look at the examples and methods for pattern classification in Chap. 2. This is the background assumed for the NN/ML application to "Selfie" photo image analysis in Karpathy's tutorial essay (below).
Crash Course Computer Science
- Video Lesson 34: Machine Learning & Artificial Intelligence. [A good overview before studying specific ML applications. Notice how most of the examples are about pattern recognition in different contexts.]
- Video Lesson 35: Computer Vision. [Further background on the designs for digital images, the foundations for pattern recognition in images.]
Crash Course AI
- Video Lesson 3: Introduction to Neural Networks and Deep Learning
- Video Lesson 4: Training Neural Network
3Blue1Brown: Video Lesson: What is a "Neural Network" in ML?
- This is the best animated math description I've seen for explaining the basic designs for ML "Artificial Neural Networks" (ANNs) and why and how they work. Even if you can't keep up with the math (that's OK), the visualizations will help you understand how the ANNs are designed with human logic and math, and why/how ML is possible by using graph models and matrices (groups of ordered numbers) for statistical calculations. These models are now represented in the special linear notation of code for functions and processes in a programming language, which is then applied to interpreting the data.
- Note: The excellent lessons in the 3Blue1Brown Machine Learning series are also on the producer's site with transcriptions; view others on your own over the next few weeks.
For reference (1): Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach. 4th ed., 2022. Survey the Table of Contents and survey Chap. 1.
- This is the most widely used (and cited) textbook for AI/ML, now in a new edition with background on recent developments in Neural Net systems. [You are not expected to understand the topics in this selection from the book; just do your best to take in the key topics.] It will all seem overwhelming now, but with this overview you can begin to appreciate all the work that has brought us to where we are today. This background will be assumed in the approaches to topics that you will study in the rest of the course.
For reference (2) Ethem Alpaydin, Machine Learning. Rev. ed. MIT Press, 2021. [Download pdf.] For this week and beyond, refer to Chap. 3, Pattern Recognition (71-103). The whole book is a good short overview of Machine Learning (but not all AI methods).

De-Blackboxing Case Study: Background for this Week's Assignment
Neural Net Pattern Recognition for Analyzing "Selfie" Photos

Andrej Karpathy, “What a Deep Neural Network Thinks About Your #selfie,” Andrej Karpathy Blog (blog), October 25, 2015.
- This mini-tutorial was written at an earlier stage in the ANN ML methods, but it remains a really useful introduction by one of the designers. At Stanford, Karpathy was a student and colleague of Fei-Fei Li (developer of ImageNet, the breakthrough database used for training photo-image ML models), one of the co-founders of OpenAI, briefly head of AI at Tesla, and now working on his own company. In this essay, he provides an "inside the blackbox" view of how CNNs, "Convolutional Neural Networks," are designed for doing feature detection, pattern recognition, and probabilistic inferences and predications for classifying photo images. CNNs are a version of the ANN mathematical graphs that can continually readjust the "weights" (values to be used in calculations) in the nodes for the probabilistic calculations. This is a major statistical method in ML for computing over features (also called attributes, properties, and parameters) as directed in the programmed algorithms.
- [Karpathy is one of the smartest and most highly regarded people working in ML/AI, and he is very generous about sharing his knowledge and making the field accessible to others. Read anything he has written or watch his videos and you will be very happy about the time you spent.]
- Background Notes: All ANN ML techniques are based on models in mathematical graphs and matrices for multiple layers of algorithms that can work in parallel and recursively (feeding outputs back in). This provides mathematical models for implementing multiple weighted statistical calculations over indexed features (the features identified, automatically or predefined). The "nodes" in the graphs represent a part of a statistical calculation with many interim results, which can be fed back into the calculating nodes, and "fine tuned" for targeted output approximations. All the formal algorithms must be encoded in a programming language, and then "written" into the memory of a computer system to become a runnable/running program.
- By looking a little inside the "blackbox," you can also get a sense of the ethical implications of programming criteria (the parameters defined for selecting and "weighting" certain kinds of features and patterns) when the ML design is automatically "tuned" for certain kinds of results.

Writing assignment (Canvas Discussions Link)

With the background in this week's video lessons and readings, describe Karpathy's explanation for his experiment in designing an ANN for "evaluating" selfie photos based on social media "likes"? Discuss what you find to be the key points and issues in Karpathy's deblackboxing of the algorithms that he used in his selfie analyzer. Of course, there will be much that is new and difficult this week, but do your best to connect what you've learned last week and this week in your discussion.
Note: In your reading and discussion this week, don't be put off by the mathematical and technical models introduced in the readings and video lessons. You will discover that the mathematics and logic in the graph and matrix models for the algorithms (automatable calculations) are all designed in the service of our cognitive abilities for recognizing and interpreting patterns, and then making predictions, based on the already "learned" (memorized regularities) patterns, for analyzing the features and patterns in new, future data. All the data and programming methods are finally about what kinds (genres) of photo images mean to us human interpreters in our meaning communities. Because it's all by human design, there's nothing "black" in the unobservable AI/ML "black boxes."

Week 7: Introduction to Natural Language Processing (NLP) for AI/ML

Learning Objectives and Main Topics

This week students will learn the basic principles of Natural Language Processing (NLP), which is based on pattern recognition techniques for both text analysis and speech recognition with AI/ML methods. NLP is a huge, multidisciplinary field, and now combines research and methods from Computational Linguistics, Machine Learning and Machine Translation, and recent Neural Net models. NLP in AI/ML applications are designed with a suite of computational techniques for sorting words, mapping syntactic patterns, labeling words in groups with frequently occurring "neighbor" words (n-grams), translation from a "source" to a "target" language, and processing text-to-speech and speech-to-text. We can only introduce a few of the topics and techniques in NLP this week, so that you can extend and apply what you have learned about text encoding and text as computable data.

Definitions: Natural Language means the human languages acquired naturally by being born in a language community, as distinguished from "second-order" or "artificial," "formal," "languages" developed for mathematics, sciences, computing, philosophy, and other fields. "Formal languages" are systems of specialized terms, symbols, and notation, like the symbols and terms used in mathematics, logic, linguistics, and computer code. Some "formal languages" are also termed "metalanguages," i.e., specialized "languages" about other languages. Computer programming "code" is a metalanguage for creating interpretations of data (through algorithms) represented in further data as "outputs".

Background: The designs for NLP in AI/ML applications today have been developed by combining knowledge and techniques from linguistics, mathematics, logic, data science, and core computer science. All NLP applications in programs are built up with many kinds of formal "metalanguages" (from logic and mathematics) for defining, describing, and analyzing instances of natural language (spoken and written) as data. Generative AI programs (like OpenAI's GPT) are designed with many layers of NLP processes for creating data sets, combined with Artificial Neural Network (ANN) algorithms for statistical patterns, probabilities, and predictions for "generative" projections (in outputs).

There is a long history of research, theory, and methods in Computational Linguistics ("Comp Ling":) since the 1950s. Comp Ling includes methods for interpreting large bodies of text (text "corpora"), parsing (deducing the grammatical structures [syntax] of sentences in all languages), and semantics (mapping networks of meaning for words and phrases). These topics continue to be important research problems in themselves, and not necessarily part of an AI goal, but the research problems and working models in Comp Ling have recently been overshadowed by (but not replaced by) the statistical, big-data, Neural Network Machine Learning methods. The methods and practices of both traditions have now merged into one large multidisciplinary field, and there is (as yet) no complete textbook introduction to the current state of the whole NLP field.

Readings and Video Lessons:

General Background on Natural Language Processing for AI/ML
- See Wikipedia background article on NLP, and on Computational Linguistics.
Andrew Ng and DeepLearning.ai: Introduction to Natural Language Processing [pdf]
- [The web page seems to be not available now. The link above is to a pdf file that I made from the file in the Internet Archive Wayback Machine [link] (will be slower).
  This is a really good summary introduction to the field by one of the leaders in Neural Net ML/AI. After years of research and teaching at Stanford (he's a colleague and friend of Karpathy), he founded Coursera, the online learning platform company, with the goal of democratizing education. His recent company, DeepLearning.ai, hosts many open courses on AI (which you can enroll in if you want to go further).
Video Lessons:
- Crash Course Computer Science, Lesson 36: Natural Language Processing.
- Crash Course AI, Lesson 7: Natural Language Processing

Background: Research and Methods under development before Neural Net Machine Learning

Labeled and Linked Data Sets for Semantic Networks of Words
(before ANN methods)
- WordNet | See the Wikipedia article for background.
- Try using the Open English WordNet: An API to search English words in WordNet
- The WordNet project aimed to create a linked database of millions of words with labels (metadata) for lexical (with syntax features) and semantic concepts in networks of relations. The database has an API that is still used in many applications, although Neural Net methods with massive data sets (possible with development in computer architecture and memory since c.2005) have mostly replaced the WordNet model.
- BabelNet | See About BabelNet | See the Wikipedia article for background on BabelNet project.
- BabelNet expands on the WordNet model for a linked semantic network database for words and concepts in many languages, and the platform can translate across many languages and provide deeper possible semantic relations.
- Try out a visualization of a semantic network in BabelNet: Type a word in the input field. The interface will send your word token to the database for fetching the whole semantic network of the word as the output. (There will usually be many use contexts or a word, so click on one of the uses/contexts that are returned). Next, click on the "Explore Network" icon for an interactive graph of semantic relations and contexts. Here is an example using the word "information" (click on "Explore Network"). Mouse over or click on the yellow dots on the right side (nodes), and you can also mouse over and zoom in/out over any of the nodes.
  - There is also a mobile app for BabelNet now.

Important NLP Method for Machine Learning: Tokenizing
Identifying the word-units in a data set with OpenAI's Tokenizer

Background: The term "token" is adapted from C. S. Peirce's semiotic terminology: A token is a physical, material, perceptible instance of a symbolic form, pattern, or type. Unlimited instances of tokens can be generated from the abstract pattern, form, or type (letters, words, image genres, etc.). (When we see the "word count" for a doc in MS Word or Google Docs, this is a count of the "word tokens.") In NLP, and now in LLMs, the term has come to mean the type of indexed word form that results from analyzing a data set for word occurrences and frequencies, which results in minimal, indexed units, like a dictionary entry. So, to "tokenize" in current NLP in ML is (in Peirce's original sense) to de-tokenize, to resolve all the millions of word instances into single entities (like the "word" list in a dictionary). Creating an index of these data entities ("tokens" now considered as dictionary-like word and sub-word units) is the starting point for all ML statistical analysis of frequencies and word-units patterns.
Using OpenAI's Tokenizer: In preparation for your discussion post.
- First, click on the "Show example" in the top input box on the Tokenizer page. In the output display box below, you will view a visualization of the Unicode character strings grouped into word and sub-word units (tokens). Next, click on "Token IDs" in the bottom output box. For the example, the interface returns a view of the tokens as word and sub-word instances (highlighted to show the token units used in the GPT model). Next, click on "Token IDs" in the tab in bottom output box for a view of the tokens as indexed with "ID" numbers in the text data set that OpenAI has processed in a GPT model. In the current interface, you can choose 3 different language models for the word tokens and their ID numbers in the data set.
- Note: All NLP processes are based on word tokenization in Unicode character-encoded strings. For LLMs (modeling from billions of word instances), it's faster to compute the word-tokens as numbers representing an index "ID" number in the whole data set (which will have many multiple words instances and clusters of word occurrences). The index numbers are then used for computing over the many ordered patterns of 2, 3, 4, and more word frequencies (phrases most frequently occurring), and all other statistical analyses.
- Note: This means that GPT and all LLMs (Large Language Models) have never "seen" (interpreted, processed, computed) a "word" in our understanding of a "word." All the statistical calculations are done with token ID numbers, and for generated outputs, the indexed token-IDs are first calculated for producing the probable connected strings, and then the IDs are converted back to their indexed word units represented in Unicode characters (including the Unicode bytecode for spaces, punctuation marks, and all other symbols). The final Unicode character sequences are the "generated text" that we interpret in the "outputs" from the LLM in our Web interface.
- We will study the methods used for ChatCPT and LLMs next week.

NLP "Machine Translation" Example: Google Translate

See the Wikipedia article for background and history of "Machine Translation."
Google Translate: See the Wikipedia article on the history of Google Translate, and on Google's recent "neural machine translation" model [useful background].
Code Emporium Video Lesson: How Google Translate's AI/ML Works.
Google Translate (translate.google.com)
- Google's announcement of the neural network model for the new version of Translate (2016) (migrating from "phrase-based" translation methods), and product announcement for Translate for mobile apps and Neural Machine Translation (2017). Here is Google's official blog about Translate.

Google Translate mobile app: uses smartphone camera and microphone for translating text and speech.
You are probably already familiar with using Google Translate, so continue to the "hands-on" exercise for your discussion post below.

Writing assignment (Canvas Discussions Link)
NLP "Machine Translation" and Tokenizing Case Study: Google Translate, ChatGPT-4o, and Perplexity

Learning from DIY ("Do it Yourself") examples. Four steps:
(1) Copy 4-5 sentences (from your own writing or from any paragraph of text) to use as a source text example for three "Machine Translation" applications (Google Translate, ChatGPT-4o and Perplexity). Use English for either your source or target language, so that everyone can follow your experiment in machine translation with the word tokens used.
- For Google Translate, choose your "source" and "target" languages (the "input" and "output" languages), and paste your example text in the left input ("source") box. Copy the inputs and outputs for your discussion post.
- For the LLMs, use the prompt: Translate the following text from [language] to [language]: "[insert your text in quotation marks]". Quotation marks help the LLM token interpreter to identify the word token strings (in Unicode, of course) to translate, as differentiated from the tokens in your prompt.
(2) In your discussion post, copy and paste your example source sentences and the three versions of translations in the target language, and discuss how the three ML translations compare. For ChatGPT, you can even try different language models to see if a different GPT model does "better" on the translation.
(3) Next, use OpenAI's Tokenizer tool to view the word tokens as they are indexed in the GPT LLM. Paste your example text in the "Enter some text" box (you can view how the different models are programmed to "tokenize" and index the word units). For your post, copy and paste (or use a screenshot) of your example text and GPT's tokens, and likewise, copy/paste (or screen shot) of the Token IDs. Include the Tokenizer's representations for both your example input text and translation text (from one of the Machine Translation sources) for both the highlighted tokens and the Token IDs.
(4) Discussion: Using the background in this week's readings and video lessons, describe some of the NLP processes inside the "black boxes" of the translation process and the tokenizing method for indexing word-forms. Of course, Unicode is behind all the "visualizations" (screen representations) of tokens (and labels and index numbers). Word-unit decomposition into "tokens" will be in Unicode byte-code code-point representations (the encoded units represented as data in any system's memory and interpreted internally by programs using the data). There is, of course, a lot more complexity in the layers of the processes, but see how much you can describe for what happens between the inputs and outputs in your example.

Week 8: From Chatbots to Large Language Models (LLMs) for "Conversational AI"

Learning Objectives and Main Topics:

"The hottest new programming language is English." --Andrej Karpathy

Learning the background history and design principles for "Chatbots" and online virtual assistants (e.g., Siri, Alexa, Google Assistant) that have preceded, but provided "proof of concept" for developments in recent ML LLMs (Large Language Models) for Gen AI (Generative AI) systems (ChatGPT, Anthropic's Claude, Perplexity, and platforms by Google and Meta).

Background: "Virtual Assistant" systems (Siri, Google Assistant, Alexa, etc.) and recent "Gen AI" developments (ChatGPT) are often categorized as "Chat Bots," i.e., "Robot(ic) [Automated] Virtual Chat" systems. It's a term from earlier Q&A text-based systems, but the newer LLM-based "generative" systems (designed with "Deep" layers of Neural Net nodes) are very different from the earlier systems (though still based on the input-output architecture).

Background on ChatBots and Dialogic Systems up to ChatGPT

See Wikipedia backgrounds for quick overviews:
- Chatbot [Wikipedia flags this as out-of-date, but background history is OK]
- Virtual Assistants (Siri, Amazon Alexa,Google Assistant; mostly business applications)
Video Lessons
- From the First Chatbot to ChatGPT (The Verge) [ha! the presenter says "1966 was along time ago"! But this is a good overview of the background history.]
- Code.org: How Chatbots and LLMs Work [simplified introduction]
Research papers on arxiv.org. This is the most important open-access service for articles and conference papers in every area of computing, AI/ML, and other sciences. Papers are published here before appearing in journals and other publications. Learn how to do research on this site. Here are two recent papers on chatbots and virtual assistants with good detailed background. Click on "View PDF" on the right. Survey for the abstracts, main headings, and as much of the technical details that interest you.
- Sumit Kumar Dam et al., “A Complete Survey on LLM-Based AI Chatbots” (arXiv, June 17, 2024), doi:10.48550/arXiv.2406.16937.
- Guendalina Caldarini, Sardar Jaf, and Kenneth McGarry, “A Literature Survey of Recent Advances in Chatbots,” Information 13, no. 1 (January 15, 2022): 41, doi:10.3390/info13010041 (arxiv.org).

Design Principles for Chatbots and Virtual Assistants: From the Primary Sources

Survey these sources for background on the "dialogic" systems from c. 1966 to c. 2020 and development of LLMs. The already established technologies for Chatbots and Assistants have now merged -- and have been updated with -- ML for NLP, including combinations with LLMs and Gen AI (for example, Siri and Alexa). We find the automated "dialogic/conversational agent" or "chatbot" technologies in many sectors: banking, online shopping, phone service (many applications are built on AWS Amazon Lex, the ML platform behind Alexa).
ELIZA: first automated dialogic "chatbot." See Wikipedia background on ELIZA
- View and download the original article: Joseph Weizenbaum, “Eliza -- A Computer Program for the Study of Natural Language Communication Between Man and Machine,” Communications of the ACM 9, no. 1 (January 1, 1966): 36-45.
ELIZA emulator [use for assignment below]
- This site provides an "emulator" (a software virtualized version of another platform) of the original ELIZA system developed by Joseph Weizenbaum in 1966. See the background below the dialog window, and the JavaScript code that this emulator is written in.
- Try out some dialog with the system. You can try the different terminal interfaces, but the main one is the most efficient. This is a "rule-based" system with pre-configured vocabulary, phrases, and minimal grammatical structures. These kinds of "chatbots" cannot be "generative," but are designed to produce outputs from from words and phrases already configured in limited files.
Apple Siri (link) (scroll down to "Apple Intelligence")
- Wikipedia background ["Siri" as an voice and text NLP system has an interesting history of development before Apple.]
- Apple's Patent Application for "An intelligent automated assistant system" (US Patent Office, 2011). Apple has developed and updated Siri continuously since 2011.
  [Scroll down to read the description of the "invention", and/or download the whole document. Note the block diagrams for the layers of the system. There are now many patent lawsuits going on for "prior art" in Siri and speech recognition systems.
  [See also: Apple's World Intellectual Property Organization (WIPO) Patent, "Intelligent assistant for home automation" (2015).]
- Abstract of the system in the 2011 patent:
  "An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionally powered by external services with which the system can interact."
- Siri is now part of Apple's "Apple Intelligence" division (for products), and part of Apple Machine Learning Research and Speech and NLP [view site]. Background from Apple Research and Apple Machine Learning Journal:
  - "Hey Siri: An On-device DNN-powered Voice Trigger for Apple’s Personal Assistant" (October, 2017). [Description of how Apple redesigned Siri with a "Neural Net" Speech Recognition system in 2017.]
  - Voice Trigger System for Siri (August, 2023).
Google Assistant (link): Google Assistant Homepage [company info and app integrations]
[This is for background and reference; not part of this week's assignment, but we will discuss in class.]
- Google Assistant: Wikipedia background [mostly business and product information]
- Google Assistant for Developers (Google)
Amazon Lex (link). AWS (Amazon Web Services Cloud) includes customizable automated "chat" services for every business sector: "Build and deploy conversational AI interfaces with Amazon Lex". This is the Cloud server-side technology behind Alexa (in consumer devices). The web page for Lex itself has a pop-up chat dialog box!
[This is for background and reference; not part of this week's assignment, but we will discuss in class.]
- AWS Lex Resources page: videos and background on applications for different sectors

Writing assignment (Canvas Discussions Link)

Capture some simple dialog exchanges with the ELIZA system, Apple's Siri (or another mobile chat-based app), and a brief question-and-answer dialog with ChatGPT. (If you have a different mobile phone that doesn't have Siri, use a voice "assistant" in the device you have.) Copy and paste some of your example dialog strings in your post.
Based on the context and background in the readings and video lessons this week, describe, as best as you can, the design of the "chat" automated dialog in the different systems:
(1) ELIZA, a "rule-based" system with a small set of vocabulary items (words) and phrases, and rules for embedding parts of input strings into an output (typically worded as a question in ELIZA).
(2) Siri, at first, more of a rule-based "bot" (automated dialog agent) configured for voice prompts for Apple services. For example, "Siri, play [song, musician]", which was designed to trigger a search for, then play with the on-device app, a downloaded track from the Apple store. Over the past few years, Apple has redesigned the voice prompt service and re-engineered the server-side back-end for ML and "Apple Intelligence."
(3) What is different about ChatGPT, in the kind of AI/ML and data being used, compared with the earlier and established ChatBots and Virtual Assistants?
What did you find interesting in the background history of development for the chatbot systems leading up to recent Generative AI in the LLM systems? (Note that OpenAI created the public interface for their GPT models with the name "ChatGPT" for the recognition factor.)

Week 9: Design Principles for LLMs, ChatGPT, and Generative AI Systems

Learning Objectives and Main Topics:

This week will be devoted to further deblackboxing LLMs and Generative AI by studying the design principles for OpenAI's NN Transformer models for ChatGPT and DALL-E.

The Transformer model (behind ChatGPT) extends and combines Neural Net "Computational Graph" developments by adding an "Attention" layer (calculations for the most significant patterns in strings of words) between inputs and outputs.

There are other NN Transformer models for text (by Google and Microsoft) and images (Stable Diffusion; Midjourney), but we will focus on the OpenAI developments because much more is known about the design of the generative models that are being implemented by Open AI.

Note on De-Blackboxing: As we go further into the black boxes of AI systems by following design principles, don't worry if the formal math and the versions of Neural Net graphs are beyond what you're prepared for understating now. It's important to know how the technical implementations are possible, and that the math and code "inside" the black boxes are not "black" -- just what most of us don't have a background in, but could learn if we wanted to. (Geoffrey Hinton, who just won the Nobel Prize, has been working on Neural Net models and algorithms for 40 years, so we can admire what it took to get us here today, and not get stressed or put off by the cumulative complexity in computing, math, programming, and Big Data that has made recent Generative AI possible!)

Video Lessons

These video lessons are incredibly well-produced with great illustrations, and are way ahead of anything we have in textbooks. They are a bit longer than the quick introductions, but you will learn so much by following each lesson as well as you can.
Art of the Problem: 30 Year History to ChatGPT and LLMs
Code Emporium [Playlist of series on NLP and Language Models. For this week:]
[These lessons are shorter overviews for what is explained in depth in the 3Brown1Blue videos.]
3Brown1Blue: [Playlist of series on Neural Net ML/AI. For this week:]
- What is a GPT: How Large Language Models Work, Visual Intro to Transformers.
- Attention in Transformers, Visually Explained

Primary Sources for the Development of LLMs and Generative AI

Graduate students should have access to the primary sources for the ideas in the fields that they are studying. You may not be able to use or reference these sources this week, but you can always return and use them in coming weeks and in your own research and learning.
Code Emporium (video lesson): 75 Years of Language Models
[This is a great summary of the research and theory that has gotten us to where we are today from the original papers published by the main founders of ML/AI. Deblackboxing also needs to include the background history of ideas and technical implementations that are assumed and combined in our systems today. Some of the main papers cited are in our Google Drive library folder for ML/AI.]
Background from OpenAI
- It was only two years ago (Nov. 30, 2022) that OpenAI released the first version of ChatGPT to the public (link to announcement) via a web interface.
- "Better language models and their implications." Introduction to the GPT (Generative Pre-Trained Transformer) model (for public release of GPT 2.0) (2019); described in detail in:
- OpenAI's research paper, "Language Models are Unsupervised Multitask Learners" (2019). This is considered a major turning point in Large Language Models. (Download and read the abstract.)
- Introducing ChatGPT (OpenAI's announcement, Nov. 30, 2022) & ChatGPT Plus
  [A lot has happened in development since Fall 2022!]
- OpenAI: All Research on LLMs (view to get an overview of what has been made public)

Writing assignment (Canvas Discussions Link)

For composing the first part of your discussion post this week, we will experiment with using the main generative AI platforms for explaining their "own" main design principles. We will compare the responses you received with what you studied in this week's context.
(1) Choose two the main platforms -- ChatGPT, Perplexity, Anthropic's Claude, or Google Gemini -- and try the following prompt:
“Act as a teacher in a college course for non-computer science majors, and explain the most important design principles of Large Language Models (LLMs) for generative AI. Include an explanation of the advantages of the Transformer model, which was presented in the paper “Attention is all You Need” (Vaswani, et al., 2017).”
(A prompt that begins by assigning a role provides better in-context results.) You can ask more detailed questions in a follow up. Try this:
“Explain in more detail the Neural Network models that are combined in LLMs and how these models are used in Generative AI for both text and images.”
Copy and paste the responses from the systems in your post. In class, we will discuss the responses that you got, whether they helped clarify any concepts for you (and whether the text generated seemed like repetitions of other sources), and questions that come from reading the responses in the context of this week.
(2) Your comments and discussion: What did you discover this week that helped you to understand the design principles for the processes in the “black boxes” of the Generative AI systems? Even though there are many levels of details and mathematical principles that make all the systems possible, could you follow the main concepts in the video lessons? Any “aha” moments? What questions do you have that we can discuss in class?

Week 10: Design Principles for AI Text-to-Image Generators (DALL-E)

Learning Objectives and Main Topics:

Deblackboxing the main top-level methods used in Image Generator applications. This week builds on what you learned about photo image data and the number maps for pixel regions in the grid matrix of a standardized digital image, and the basic design principles for Transformers in the Neural Net architectures.

Building generative image models: from ImageNet to DALL-E

There are many image generation models and public versions available in Web interfaces (MidJourney, Stable Diffusion/Stable Assistant, Google's Imagen) but we will focus on OpenAI's DALL-E for our main case study. Important learning goals:

Understanding the major design principles for Transformer Models that that are used for both LLMs and text-to-image Image Generator systems.
The text-to-image (as well as text-to-video, and text-to-music) systems are termed "multi-modal" ML/AI (meaning designed to handle, and integrate, more than one representational "mode" -- text, image, audio (includes speech/voice and music), video.
Methods unifying text and image Transformer models: Tokenizing in data set preparation and NN image generator models. Identified and recognized image "objects" are labeled patterns in photo images, which are tokenized into "patches" (contrastive regions of pixels = subcomponents of images). The labeled patches are assigned token index numbers in the dataset for computing the vectors (numbers for positions and relations in image embeddings) of the recurring subcomponents (typically normalized into n x n pixel arrays). Indexed tokens are combined in the generative decoding steps, and final software layers compose the components into an image file (typically .jpg) returned to prompt sender's interface.

Studying the research sources for this week:

I have linked some of the important research papers in the development of the text-to-image Generative AI models. You are not expected to read the papers from beginning to end (!). You can review the papers for the abstracts and main points, and the texts will be available for your own further research for final projects.

Readings and Video Lessons

For background on the text-to-image Generative AI models, see the Wikipedia article.
Welch Labs (video lesson): From AlexNet to ChatGPT and DALL-E with Transformers
- Excellent video. Remember: In generated images, we're seeing visualizations of numbers projected from probabilities. Neural Nets do not "see" or "read" images (just as LLMs do not "read" or "see" words). In the transformer models, calculations are done on tokenized regions of billions of images, which are assigned vectors (embeddings) in a method parallel with that used for LLMs.
- The research paper cited that advanced NN ML for image classification and generation (Alex Krizhevsky is the "Alex" in "AlexNet"):
  Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in Proceedings of the 25th International Conference on Neural Information Processing Systems (2012), 1097–1105. PDF here (link).
Assembly AI (video lesson): How does DALL-E work? [A useful intro, but not as technically detailed.]

OpenAI's DALL-E: Information from OpenAI and research papers published by OpenAI teams

Wikipedia background on DALL-E.
DALL-E: Creating Images from Text (announcement and description from Jan. 5, 2021).
- Note the References to the research history on which DALL-E was designed.
- OpenAI's CLIP (Contrastive Language-Image Pretraining) Image Labeling Process (Jan. 5, 2021).
- The published research paper introducing CLIP:
  Alec Radford et al., “Learning Transferable Visual Models From Natural Language Supervision” (arXiv, February 26, 2021), doi:10.48550/arXiv.2103.00020.
- About DALL-E 2 (2022).
OpenAI Introduces DALL-E 3 (scroll down to view the credits)
- DALL-E 3 is now built-in as a module in ChatGPT from version 4 on. The Image generator module is called in the system by the kind of prompts detected in the ChatGPT input box.
- See the research paper by OpenAI's developers on the advance in steps for image generation: James Betker, et al., "Improving Image Generation with Better Captions" (2023).
- Open AI: short video on using DALL-E for "inspiration" (lame).
- Open AI: short video on editing DALL-E Images in ChatGPT.

Background: Sources for the Data Sets Used in Training Image Generator Models

Data Sets used to train image object recognizers and the generators on the Transformer model:
The CIFAR 10 and 100 image datasets:Developed by Alex Krizhevsky, Vinod Nair, and Geoffrey Hinton at the Canadian Institute for Advanced Research (CIFAR).
- See Wikipedia background.
See Wikipedia's "List of datasets in computer vision and image processing."
- In the first list for "Object detection and recognition." scroll down and you will see ImageNet and the CIFAR-10 and -100 datasets, the most widely used in "benchmarking" tests of algorithms and Transformer models for images.
- Note: You will see these datasets mentioned in the research papers on ML models for images (over the past 20 years) and now on the generative Transformer models behind DALL-E, MidJourney, Stability.ai, etc.

Case Study: DALLE-E Image Generation and ChatGPT's Step-by-Step Explanation:
Preparation for Assignment

Read the documentation of my dialog with DALL-E and ChatGPT-4o on generating images from prompts in DALL-E, and then explanations of the DALL-E processes from ChatGPT. I did the same experiment with Stable Assistant (Stability.ai), which generated different kinds of images. Follow the image prompts and generated images, and also ChatGPT's detailed explanation of generative image processes in the DALL-E model.

For discussion at end of class:
Approaches for next week on using ML/AI tools for creativity, productivity, and research.

Writing assignment (Canvas Discussions Link)

Experiment with the DALL-E image generator and prompts for explaining the process like the strategy that I experimented with above. Follow these steps for your discussion post:
(1) Use your ChatGPT account with the direct DALL-E interface installed (it will appear in the left column of GPTs). Click on DALL-E, and open a new chat prompt. Write a prompt for a fairly complex and detailed image, and experiment with different amounts of specific details and styles to be included in an image. Use explicit words for genre and style and the contents (image "objects") to be generated. For example: the type or genre of image: a photograph in a period style? a HD (High Definition) science fiction type of image? a drawing, painting, or illustration in the style of a known artist? an imaginary visualization of people or places in the past or the future? a fantasy or dream-like image in a certain style with object contents that you list? (It's easier to compose and edit the image prompt in a doc first, then copy and paste it into the prompt box.)
(2) For one of the images generated, include the prompt language for explaining the image generation processes in DALL-E from the tokens in your prompt to the image that was generated.
(3) Copy and paste your prompt into your post with the image generated by DALL-E, and then ChatGPT's explanation of how the image was generated with DALL-E's internal processes. (If you have more than one image that you find interesting to discuss, include it also.) Discuss what you learned about the Transformer model for images and the correlation of your text tokens and the image tokens combined in your image. (Stylization is handled in output layers by software routines common to all image processing software like Photoshop). Ask questions to discuss in class. (There is a lot of complexity in the levels of processes, and much will always remain blackboxed in OpenAI's proprietary platform.)

Week 11: Using Generative AI Applications for Research, Productivity, and Creativity

Learning Objectives:

Our goal this week is to learn from the results of experimenting with Generative AI (Gen AI) platforms and tools as aids to productivity, research, and creativity. Students will try out their own test projects, and learn about the current state of capabilities of the platforms, and how to evaluate the results. Students will also experiment with modifying and designing prompts for specific platforms that can produce more useful results.

These projects will also be tests for use cases for current Gen AI systems and "tools" for non-trivial applications (that is, beyond shopping, product comparisons, ordinary business uses, novelty, and quick information searches). How can the platforms be used as aids for research, learning, and discovery? What are the limits/limitations of the Gen AI platforms and AI tools?

Experimenting with Gen AI Platforms for Research and Productivity

Instructions for Assignment with Links to AI Platform and Tools

Writing assignment (Canvas Discussions Link)

Present and discuss your experiments with the Gen AI applications as described in the instructions.

Week 12: Deblackboxing Methods for Explainable AI

Learning Objectives:

This week we will study a deblackboxing approach termed "explainable" and "interpretable" ML/AI. We will review some representative descriptions of the approach, current methods used and how they are evaluated and critiqued. You will find other related terms and categories of analysis in the research literature: transparency and white-boxing (the possibility of observing, i.e., visualizing, steps in operations, features in training data that lead to certain outputs -- the opposite of blackboxing) and accountability (whether designers, developers, and corporate owners of systems can give an account of how the systems make decisions, especially in high-stakes contexts like law, health care, employment decisions, and financial decisions). These topics overlap with the concept of "Responsible AI" and thus are also closely connected with Ethics and Policy issues, which we will continue next week.

The large debate on this topic, underway for several years, spans many contexts of discourse, and, unfortunately, many arguments fail to include definitions of "ML" and "AI" for which "explanations" can be given. As we will discuss further next week, we find that many descriptions about ML/AI systems are ideological and based on unquestioned assumptions. Making assertions and claims (no matter how strong) is not the same thing as having an argument and providing an explanation.

In your reading and discussion, we will maintain an important larger framework for questions:

"Explainable" or "Interpretable" from what point of view, by whom, for whom, in what contexts of use and application of ML/AI?
Keeping design principles in view: How do we pose the "why" questions for design principles, and at what technical and conceptual levels? How can deblackboxing the design principles implemented in various versions of ML/AI (for example, Transformers for LLMs and image generators, tokenizing for text and images) provide foundations for explanations?
What are the current methods used for "explaining" ML: tools for establishing trust in predictions, traceable processes, understanding ML model architectures.

For explanations and "explainability," we will also apply our deblackboxing methods by uncovering design principles for semiotic systems, and test how and whether emphasizing design principles can clear away a lot of false assumptions and misinformation now circulating about AI/ML (and "Big Data").

Readings and Video Lessons

Wikipedia article: Explainable AI (survey for main topics).
Jonathon Phillips et al., Four Principles of Explainable Artificial Intelligence (NIST Report, September, 2021).
- This is a short report written for the US National Institute of Standards (NIST). You can survey the main Principles for general background, and read sections that interest you. Note the extensive bibliography of references at the end (if you want to follow up for further research).
Video Lessons:
- Jay Alammar, “Explainable AI Cheat Sheet,” May 4, 2021 (GitHub intro)
  - Video: Explainable AI Cheat Sheet
  - Alammar has a good technical series of videos: "Explainable AI Guide" (if you want to go further)
- Data Odyssey: Interpretable and Explainable AI (Introduction)
- IBM: What is Explainable AI (Website) | IBM Video presentation: Explainable AI
  - This is a view for business contexts, and use of ML/AI products.
Research Papers (from the ML/AI technical community):
Review these papers for the abstracts, introductions, outlines of topics, and main issues.
- Arrieta, Alejandro Barredo, et al. “Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI.” Information Fusion 58 (June 1, 2020): 82–115.
  [This is an extensively researched article. Review the main points for how those in the field are approaching the question. Good starting point for your research project, if you are interested.]
- Haiyan Zhao et al., “Explainability for Large Language Models: A Survey” (arXiv, November 28, 2023). [Click on "View PDF" and download]
- Luca Longo et al., “Explainable Artificial Intelligence (XAI) 2.0: A Manifesto of Open Challenges and Interdisciplinary Research Directions,” Information Fusion 106 (June 1, 2024).
  - This article, by 19 international researchers in AI. is a true "state of the problem" for research and theory, but, of course, very densely packed and beyond our level for the course. But survey the contents and the model on p.6 for their "Manifesto" for XAI.
- A "Real-Time" View of Current Research and Theory:
  View index of all arxiv.org papers on "Explainable AI": (2,725 results as of 11.9.24)
  - Scroll through the papers sorted by most recent, most of which are in specialized areas. Do you notice one listed (by title and abstract) that seems more generally accessible for our study of ML/AI issues? Read the abstract and introduction: can you find the main questions, definitions, topics of debate, and challenges?

Writing assignment (Canvas Discussions Link)

This is a large topic with many open questions and applications in different contexts. Use 2-3 of the following question prompts to guide your thinking and and to capture your main take-aways:
From reviewing the background (in the readings and video lessons), summarize what you find to be the main issues, approaches, and challenges for "explainable" ML/AI? What have your learned from our deblackboxing methods (using design thinking, systems thinking, and semiotic systems thinking) that could be applied to the "explainability" questions? What kinds of "explainability" would be most valuable in different social contexts (business organizations in different sectors, schools and universities, law and government organizations) and for people with different levels of knowledge?
What approaches to "explainable" (interpretable) ML/AI do think would help if the approaches were more widely understood and practiced? What if the problems and concepts became part of our popular discourse, debates, and media coverage of AI/ML? Would some of the "explainable AI" approaches help in deblackboxing the larger issues about AI/ML systems in debates about ethics and "bias" (next week's topic)?

Week 13: Ethical, Social, and Policy Issues for ML/AI

Learning Objectives and Main Topics:

This unit will provide a top-level overview of the ethical, social policy, and governmental issues for the development and application of AI/ML in services, products, and online platforms.

The whole domain of ethics, policy, and regulation is such a huge area of concern now, and every week brings further news, red flag issues, and debate about Big Data and ML/AI ethics. Further, revelations about the environmental and energy costs of ML data centers are a major concern as the industry continues to scale up with seemingly unlimited needs for resources. This includes the massive amount of electrical power and material resources required for the processors. memory chips, and data infrastructure for the "compute" power required for preparing and running the ML/AI models and platforms.

Ethics must always start with the truth: The Importance of Deblackboxing

The starting point for ethics in relation to any technical system is a commitment to the truth about the technologies as designed systems; that is, truthful, de-blackboxed, open and accessible descriptions, definitions, and explanations (see the Floridi and Cowls reading below). Any definition or conception of an "ethics of/for AI" must begin with truthful definitions and descriptions of the systems. (That is, learning and understand the design principles of the technologies, the foundations in computation and data, how and why systems are designed the way they are, so we can communicate and explain first principles for informed public discourse.) The truths include understanding capabilities and limitations and correcting misinformation and myths. Having truthful stating points for any discussion of ethics or policy is why our deblackboxing methods for the design principles of ML/AI (in any version) are so important.

There is no "there" there outside human design, purposes, and intentions

ML/AI does not "have" ethics. "Machines" do not "have" ethics. People in communities, organizations, and institutions "have" ethics.

Main Topics in AI and Data Ethics and Policy

Transparency, Explainability, and De-Blackboxing of Closed Systems
- “Alignment” of design of AI systems with human intentions and values.
Accountability, Responsibility
- Who will be accountable and responsible for the security, reliability, and truthfulness of their systems, and mitigate harmful effects when discovered?
Trust and Reliability: Dealing with Truth, Falsehood, Lies, and Fakes
- AI system generated fakes and people using system for Intentional falsehoods and "fake" images and videos of political leaders and public figures.
- AI software for detecting “fake” and AI-generated text and images.
- Political, social, and legal consequences of "fakes" and intentional AI-generated misinformation.
Justice, Fairness, and Bias in Big Data and AI/ML Systems
- How are predictions and results generated by AI/ML models + training data (see Trust and Reliability)?
- “Bias” in Machine Learning: already critiqued facial recognition systems and stereotype image generation from datasets.
- Identity issues from under-representation of certain ethnic and gender examples in image datasets.
Governments and Governance in Democracies, Constitutional Rights and Laws
Employment, Jobs
- Will recent AI/ML platforms affect jobs and job categories, and/or will the new tech allow the creation of new kinds of jobs?
Freedom, Autonomy, Human Agency
Guaranteeing Beneficial and Non-Harmful Uses
Privacy and Security for Data Used in Datasets for ML/AI
Creative Rights, IP Rights, Ownership
- Generative AI based on large dataset models from harvested text and image data accessible on the Internet may infringe on rights of creators and owners of the source images and texts.
- Problem of aggregated data sources that necessarily reduce all data instances in training data sets (texts and images) to patterns of statistical tokens detached from creators or owners.
Environmental effects and energy costs for ML/AI Cloud Data Centers and material needed for processors, memory chips, and infrastructure.

Many of the issues in this list are not unique to ML/AI developments, and already existed for Big Data and Cloud computing architecture and the physical infrastructure of data centers. The major challenges for AI/ML ethics are political-economic: how can any collective decision about ethics and policy be put into practice in laws and regulation (national and international), as well as in agreements among tech stakeholders in various sectors.

Readings

The difficulty in defining the issues for ML/AI ethics: Wikipedia article: Ethics of Artificial Intelligence.
[This article is all over the place, and provides no references to critiques of assumptions.]
- See also: Wikipedia, AI Alignment. [Question of aligning ML/AI with human purposes, needs, and values.]

Introductions to Major Issues and Approaches

Luciano Floridi and Josh Cowls. “A Unified Framework of Five Principles for AI in Society.” Harvard Data Science Review 1/1 (June 23, 2019): 1-15. Pdf version.
Yoshua Bengio et al., “Managing Extreme AI Risks amid Rapid Progress,” Science 384, no. 6698 (May 24, 2024): 842–45. arxiv pdf.

Main Texts for This Week

We will use these two texts for summaries of the main issues from two vantage points:
Floridi is a major leader in research and theory for the philosophy and ethics of technology, who also knows the technical principles. Bengio, a major thought-leader in Deep Learning, is one of the three ACM "Turing Award" winners with Geoffrey Hinton and Yann LeCun. Bengio, for many years at the University of Montreal, has been leading a movement to inform policy makers and governments about the benefits and risks of AI. Download these texts and review the main topics in the contents and section headings. Choose one topic/issue to study more fully.
Luciano Floridi, The Ethics of Artificial Intelligence: Principles, Challenges, and Opportunities (Oxford; New York: Oxford University Press, 2023). [Download to review topics.]
Yoshua Bengio and et al., “International Scientific Report on the Safety of Advanced AI” (London: UK Secretariat, May 2024). [Also: pdf file on our Google Drive.]
- Background for the report on Bengio's personal site.

AI Ethics and Responsibility Organizations

Partnership on AI (all tech companies and academic research groups are members)
Algorithmic Justice League (working on fairness and equity in race and gender)
- Algorithmic Justice League, Facial Recognition Technologies: A Primer [pdf] and website research page link.
Stanford AI Index: 2024 Report. See section 3: Responsible AI
Future of Life Institute: Open Letter: Pause Giant AI Experiments
- Note the main signers of the letter. Parallel with the movement for explainable (or interpretable, transparent) AI, others are calling for a "pause" on the further development of massive "Big Data" models. Is this realistic?

Research Bibliography on AI Ethics (Irvine, ed., Google Doc)

E-Text Library for Further Research: Shared Drive: Ethics for AI & Data

Writing assignment (Canvas Discussions Link)

Use the the topic headings in Floridi, The Ethics of AI (Oxford, 2023), and the policy topics outlined in Bengio et al. (Science, 2024) and the expanded version in Bengio, et al., International Scientific Report on the Safety of Advanced AI (2024). From your learning so far, and from the overview of the topics and issues In this large field, what are you most interested in or concerned about? Choose one topic or issue to focus on (or 1-2 related), and write a short report with references for someone who needs a brief overview of the issues, why they are important, and what is being done to address the issues, and the challenges and complexities for implementing action or interventions on the issue. What do you think the next steps should be on the issue?
In your presentation, discuss how deblackboxing helps for providing a foundation for asking ethical questions. Provide some references (books, articles, relevant websites by ethics and policy organizations) for your readers to consult for following up. If you are interested, your research and presentation can be the starting point for your final project.

For Discussion in Class:

At the end of the class time, we will have an open discussion about your main learning achievements in the course, the "aha!" moments, and questions you have. We can also include an open discussion about ideas for your final project, and how to extend and apply what you've learned to a topic that you want to investigate further.

Week 14: Wrap-up and Discussion of Final Projects

Class Discussion:

Group discussion on your final project topics, ideas, and the current state of your research and thinking. Everyone will present their topics, approaches, and beginning research sources, and get feedback and suggestions from the class.

Post your main ideas, outline, and beginning bibliography in the Canvas Discussions for this week.

General Instructions for Final Capstone Projects (link)

Post Your Final Project Ideas for This Week's Discussion
(Canvas Discussions Link)

Due Date for Posting Your Final Project: