Georgetown University
Graduate School of Arts and Sciences
Communication, Culture & Technology Program

CCTP-607 Leading Ideas in Technology: AI to the Cloud
Professor Martin Irvine

Spring 2023

This course provides a conceptual and design-oriented introduction to the key concepts in computing and data technologies as now developed in AI and Cloud computing. (This course builds on the general methods in CCTP-820, Leading by Design, but it has a specific focus and can be taken independently). This course is especially designed for students from non-technical backgrounds, and provides methods for “deblackboxing” complex technologies through the key concepts and design principles approach.

The main learning objectives are gaining conceptual competencies and critical thinking skills for understanding, interpreting, and explaining the key design principles for (1) contemporary, networked computing systems, (2) kinds of data and data processes, (3) "artificial intelligence" (AI) and "machine learning" (ML), (4) Cloud Computing and "Big Data" systems, and (5) how these technologies are now combined and integrated. With the "design principles" approach, students will understand why certain technologies are designed the way that they currently are, and be able to distinguish between design and implementation, that is, the difference between product implementations and more universal design principles that make them possible.

Framework and Main Approaches

Every day, the news media, popular discourse, marketing, and advertising are full of statements about these technologies, but they are treated as unfathomable “black boxes” and corporate-branded products. To reverse this "blackboxing," this course will provide the methods, key concepts, and analytical tools for understanding the designs of the systems, devices, and interfaces that we use every day.

Our learning method comes from applying an interdisciplinary framework, which combines:

(1) “Systems Thinking” to understand how a specific technology is part of a larger, interrelated system (for example, computing systems, kinds of software, networks, and social contexts);

(2) “Design Thinking” for uncovering how and why certain technologies are designed the way they are, including the history of designs and the consequences of design choices;

(3) “Semiotic Thinking” for understanding these technologies as artefacts of human symbolic thought, and how we can delegate some human symbolic processes to systems designed to automate them;

(4) the “Ethics and Policy” viewpoint for evaluating the social consequences of design choices in the way that technologies are implemented, and for analyzing the implications for ethics and governmental policy.


By the end of the course, students will have achieved (1) a conceptual, design-oriented understanding of computing systems, AI and ML, and Cloud systems, and (2) a competency in design thinking and systems thinking that can be applied to any sociotechnical system. These competencies will enable students to work with others in any career path to provide "deblackboxed," clear explanations of the key concepts and design principles of current and future technologies. Since the ability to communicate conceptually clear and truthful explanations of our technologies across technical and non-technical communities is greatly needed in every field and profession, students who have learned these competencies will be able to take on “thought leadership” roles in any career that they want to pursue.

View and download the pdf syllabus document for a full description of the course, Georgetown Policies, and Georgetown Student Services.

Course Format

The course will be conducted as a seminar and requires each student’s direct participation in the learning objectives in each week’s class discussions. The course has a dedicated website designed by the professor with a detailed syllabus and links to weekly readings and assignments. Each syllabus unit is designed as a building block in the interdisciplinary learning path of the seminar. For each week, students will write a short essay with comments and questions about the readings and topics of the week (posted in the Canvas Discussions module). Students will also work in teams and groups on collaborative in-class projects and group presentations prepared before class meetings.

Students will participate in the course through a suite of Web-based learning platforms and etext resources:

(1) A custom-designed Website created by the professor for the syllabus, links to readings, and weekly assignments: [this site].
(2) An e-text course library and access to shared Google Docs: most readings (and research resources) will be available in pdf format in a shared Google Drive folder prepared by the professor. Students will also create and contribute to shared, annotatable Google Docs for certain assignments and dialogue.
(3) The Canvas discussion platform for weekly assignments.


Grades will be based on:

  • Weekly short writing assignments (posted to the Canvas Discussions platform) and participation in class discussions (50%). Weekly writing must be posted at least 4 hours before each class so that students will have time to read each other's work before class for a better informed discussion in class.
  • A final research "capstone" project written as an essay or a creative application of concepts developed in the seminar (50%). Due date: one week after last day of class. Final projects will be posted as pdf documents in the Final Projects category in the Canvas Discussions platform.

Professor's Office Hours
To be announced. I will also be available most days before and after class meetings.

For all Georgetown Policies, Student Expectations, and Student Support Services:
consult and download the pdf syllabus document.

Books and Resources

This course will be based on an extensive online library of book chapters and articles in PDF format in a shared Google Drive folder (access only for enrolled students with GU ID). Most readings in each week's unit will be listed with links to pdf texts in the shared folder, or to other online resources in the GU Library.

Required Books:

  • Alpaydin, Ethem. Machine Learning: The New AI. Rev. ed. Cambridge, MA: MIT Press, 2021.
  • Peter J. Denning and Craig H. Martell. Great Principles of Computing. Cambridge, MA: The MIT Press, 2015. 

Recommended Books:

  • Cal Newport, Deep Work: Rules for Focused Success in a Distracted World. New York: Grand Central Publishing, 2016.
  • ———. Digital Minimalism: Choosing a Focused Life in a Noisy World. New York: Portfolio, 2019.
  • Gary Marcus and Ernest Davis. Rebooting AI: Building Artificial Intelligence We Can Trust. New York: Pantheon, 2019.
  • Stuart Russell, Human Compatible: Artificial Intelligence and the Problem of Control. New York: Viking, 2019.

Course Online Library (Google Drive: GU student login required)

University Resources

Using Research Tools for this Course (and beyond)

  • Required: Use Zotero for managing bibliography and data for references and footnotes.
    Directions and link to app, Georgetown Library (click open the "Zotero" tab).
    You can save, organize, export and copy and paste your references with formatted metadata into any writing project.

AI Research and Information Sources

News and Research Sources

Stanford AI100: "One Hundred Year Study on Artificial Intelligence": Main Site

AI, Ethics, and Human-Centered Design: University Research Centers

Professional Computing and AI Sources (ACM)

Orientation to Learning Goals of the Course:

  • Establishing some useful definitions, distinctions, and scope of subject matter: the key concepts and design principles approach.
  • Introducing our interdisciplinary framework that provides an integrated method for understanding all kinds of computational systems, and how to apply the method to today's computing systems, networks, AI, and Cloud systems.
  • Key terms and concepts: What are we talking about when we talk about "Computation," "Computing Systems," "Information," "Data," "Artificial Intelligence," "Machine Learning," "Cloud Computing"? Why clear and consistent vocabulary and concepts are so important.

Personal Introduction: My video introduction [Update: I've been at GU for over 30 years!]

Introductions and Interests

  • Who are we? Backgrounds and interests to be considered in developing the course.
  • Your professor's background and research interests: where I'm coming from.

Course Introduction: Requirements, Expectations, Orientation

  • Format of course, requirements, participation, weekly assignments, projects, outcomes (see above).
  • Using our Web-based syllabus (WordPress), discussion platform (Canvas), online etext library (shared Google Drive).
    • Why I use custom-designed websites for courses: teaching philosophy, instructional design, student access to materials.
  • Classroom rules: how to use PCs and mobile devices: no social media or attention sinks during class.

Using Research Tools for this Course (and beyond)

Introduction to course methods and main topics

Examples for Discussion

Main Topics and Learning Objectives:

  • The learning goals of the course (see Introduction below).
  • Learning the interdisciplinary framework and main approaches in the course for "deblackboxing" computing systems, data, AI, Machine Learning, and Cloud systems.
  • The "Key Concepts" and "Design Principles" approach for understanding the major technologies: especially for students with no technical background.

Readings and Learning Sources

  • Prof. Irvine, Introduction to the Course: Our Framework, Methods, and Key Concepts.
    [Read Part 1, to p. 13, for this week first; you can complete Part 2 next week.]
    • Download, read, and print for reference. Don't worry if all this is new and hard to understand now: you can get it! (We will go over all the key concepts, step by step, in the course.)
  • Video Lessons: Introducing the Crash Course Series on AI and Computer Science
    • The "Crash Course" video lessons are very good for getting started on many topics. The video series on Computer Science and AI provide excellent introductions to technical descriptions; we will complement them with the conceptual framework for understanding the "why" and "how" of the technical designs as we have them today.
    • Review the list of lessons for each series below; they are short and you can learn at your own pace. Here are some to get you started with background for the course:
    • List of lessons: Crash Course, Artificial Intelligence series.
    • List of lessons: Crash Course, Computer Science series.
      • For this week: view the Preview, then Lessons 3-5.
  • Peter J. Denning and Craig H. Martell, Great Principles of Computing. MIT Press, 2015 (in pdf). Read chapters 1-2 for this week. In Chap. 2, on the major "Domains" of computing, note the sections on "Artificial Intelligence," "Cloud Computing," and "Big Data."
    • We will refer to the computing principles outlined in this book throughout the course. Even though the book is for non-specialists, much may be new to you, and it will take time for the terms and concepts to become yours to think with. That's normal and OK. Read it and re-read it as we progress through the course.
    • Use this book for getting clear, precise meanings of all the technical terms in computing and data science. These terms will be our "conceptual vocabulary."
  • Clarifying a Key Term and Concept: Artificial.
    What does "Artificial" mean as used in AI theory? Major source: Herbert Simon.
    • Herbert A Simon, The Sciences of the Artificial (MIT Press, 1996/2019) (excerpt). Read this short excerpt for useful meanings of the terms "artificial" and "symbol system" presented by a leading systems theorist and one of the early founders of AI.
    • "Artificial" (in our computing and AI context):
      Does not mean "fake" or inauthentic, like "artificial sweetener." Following Simon's clarifications, the term means (1) whatever is produced by imposing a human design for making an artefact; (2) everything in computing is an artefact (human made thing) based on human symbolic thought and the use of symbols (math, logic, programming languages) defined for a symbolic processing system; (3) some artefacts are designed as interfaces between observers and observed environments (which are different systems from those of the observers).

Examples for Discussion (in class): Steps in deblackboxing.

Prof. Irvine, Introduction: Topics and Key Concepts of the Course (Slides)

Writing assignment (Canvas Discussions Link)

  • Read the Instructions for the weekly writing assignment.
  • This week provides a "top level" overview of the topics and main approaches in the course. In the following weeks, we will study the key concepts and design principles that explain the "how" and "why" of our current technologies.
  • Your first discussion post this week can be informal. You can simply write some notes about your main "takeaways" and questions from the readings (and video lessons). What questions came to mind about the main concepts and topics of the course? Can you see how you can think with any of the "macro-level" concepts and methods for the technologies that we will study? What would you like to have explained more in class?

Learning Objectives and Main Topics:

  • The key design principles of computing systems, and how AI/ML systems are designed for these systems.
    • Why learn this? The recent advances in AI/ML are all based on developments in the core technologies for computation (which includes processors, programming, database designs, memory, and data storage technologies). To understand how AI/ML systems are designed and how they work, students need to understand the basic architecture of digital computing systems from the design and conceptual viewpoint. This is an important learning step in our deblackboxing goals.
    • This unit is the first part of a sequence on the key design principles in computing systems, digital data, and the Internet as a computer networking system (continued in Weeks 4-5, 10-11).
  • Your learning goal for this week is to begin understanding the basic computing design principles, so that you can make the next learning steps for understanding how and why all our contemporary systems -- from digital media to AI and Cloud systems -- are built up in scalable layers or levels using many kinds of interconnected subsystems (lower-level systems serving the whole system). There is no magic, no mysteries -- only human designs for complex systems!
  • AI/ML in the field of computing and "data science":
    For people working in AI/ML, it's a large, interdisciplinary field that forms an ongoing research program, not a science of final conclusions. Designing AI/ML is a quest to use the tools of computational methods (keyed to the current state of our physical computer systems) to discover what can and cannot be automated in the human interpretation of data (now in massive stored forms), and what kinds of analyses and interpretations can be delegated or assigned to the algorithms (processing models) programmed in running software. AI is not an "it," a "thing," or an "object." AI/ML represents many design communities and philosophies, many kinds of methods and clusters of technologies, and many attempts at implemented designs in computing and digital data systems. Don't confuse recent instances of commercial products (digital "assistants" with speech recognition, "machine translation," face recognition, consumer preference predictions/recommender systems, text- and image-generating "bots," etc.) with the underlying design principles in computation that enable implementations in a specific system (like OpenAI's ChatGPT and DALL-E).

Readings and Video:
Fundamentals of Computer System Design Underlying AI and ML

  • Prof. Irvine, Course Introduction: Part 2 (pp. 14-21) for this week. (Download and print for reference.)
  • Prof. Irvine, (Video) "Introduction to Computer System Design" [in settings, switch to 720p or 1080p]
    • I made this video for CCT's Intro course (CCTP-505), but it is applicable for any introduction to the topic. I use some of the concepts in our course framework (in the "Course Introduction"), but we can fill in others.
    • You can also follow the presentation in a Google Slides version, if you want to review the topics at your own pace.
    • (Videos in my "Key Concepts in Technology" series may be useful as background for you.)
  • Video Lessons
  • Ethem Alpaydin, Machine Learning. Revised edition. MIT Press, 2021.
    • Read the Preface and Chap. 1 (to p.34). (We will consult this book later in the course.)
    • We are beginning the study of AI/ML in the context of computing so that you have a view of how computational processes in programs must be designed to perform the multi-level computations in AI/ML applications.
    • What is the distinction now made between "AI" and "Machine Learning" (ML)?
  • Stuart Russell and Peter Norvig, Artificial Intelligence: A Modern Approach. 4th ed., 2022. Survey the Table of Contents and Chap. 1.
    • This is the most widely used (and cited) textbook for AI/ML, now in an up-to-date edition with background on recent developments in "Neural Net" systems. You will get a glimpse of the deeper history and contexts that are behind the recent developments. It will all seem overwhelming, but with this overview you can at least appreciate all the work that has brought us to where we are today.

Examples for discussion in class.

Writing assignment (Canvas Discussions Link)

  • Continuing from last week, consider one or two of the computer system design principles (from the video lessons and readings) that help you "deblackbox" why modern computer systems embody these specific kinds of designs. Much will still be new and difficult, and you will have many questions (take notes to capture these).
  • Can you see ways that the frameworks in the Course Introduction help explain the "why" and "how" behind the technical descriptions (in the video lessons and readings)? Does working through the main principles and learning precise definitions of terms help you to "deblackbox" what seems closed and inaccessible? Express all the questions that come up for you that we can explain in class.

Learning Objectives and Topics:

In this week and next, students will learn the foundational concepts of "information" and "data" as defined in digital electronics, computing, communications, data processing, and data science.

The terms "information" and "data" are used in many ways, even within the computer and "information" sciences, so we need to cut through the "word fog" and focus on the precise technical meanings used in the context of digital binary information and computer systems. We can understand the technically precise meaning of "information," just as we can learn what "matter" means in physics, as opposed to the many ordinary-language uses of the word.

As all the introductions to computing and AI tell us, all forms of computation and AI are based on "information" (or "data") processing, encoded in various levels of symbol representations (for data, code, algorithms, programs, etc.). Instead of glossing over this topic (and relying on vague, ordinary-discourse conceptions), we need to understand two central concepts at the core of the design principles for computation and AI/ML applications: what the design principles for information and data are, and how we can keep the technical meanings of these terms clear, distinct, and useful to think with in understanding all our complex systems.

Video Lessons and Readings (in this order):

Digital Data Case Study, 1: Digital Text Encoding for all Languages: Unicode

  • The Wikipedia overview of Unicode and digital character encoding is useful.
  • The Unicode Consortium Official Site (for Reference on the open source standards)
  • See: Unicode Character Code Charts for All Languages [Version 15.0]
    • Deblackboxing a data format used for most "alphanumeric" text and symbols:
      UTF-8 (Unicode Transformation Format, 8-bit units). See the definition of "UTF." UTF-8 is the most commonly used encoding format for European Latin alphabet-based languages, including the character encoding and graphical rendering of the Web page in your current screen display "window." (Unicode also defines the UTF-16 and UTF-32 encoding forms; UTF-8 itself uses multi-byte sequences for characters beyond the basic Latin range.)
    • For "Han" Unified CJK (Chinese, Japanese, and Korean) Ideographs and Scripts (pdf reference file). (There are many extensions of encodings for Asian scripts in the Code Chart.)
      • (On "Unified CJK" see Wikipedia.) Note that the Unicode practice is to decompose scripts and ideographic writing systems into the smallest meaningful "glyphs" (strokes, marks), which can be assigned a byte unit or code variation, which can then be composed (combined) in a Unicode code point definition.
    • Unicode solves the problem of data interoperability across technical platforms, operating systems, data typing (for "text" or "string" in programming languages), kinds of software, and device-specific graphics and screens. This encoding solution enables all kinds of text data analysis and NLP.
    • Unicode, "special" characters, and text in programming languages. In the table of the Character Code Charts, you will also find the Unicode "code points" (ranges of unique code numbers) for all kinds of mathematical and technical symbols. All programming languages are now designed to use a defined set of Unicode "code points" for the text characters and special symbols used in the code language. These encoded characters and symbols get rendered on screens (like ordinary text) when we write code (in developer software designed for writing programs in Python, JavaScript, or Java), and they are what gets saved in the digital file of the program text. Then, when a "compiler" or "interpreter" program translates the program code text (like a Python file) into pure binary "machine-level code," the translation will always work reliably and consistently.
    • Important for AI/ML: Unicode encoded text is the data format of all text data analytics and Natural Language Processing (NLP) in AI/ML. Nothing in digital text would be universally "computable" without Unicode, because (1) now all text data is stored in Unicode format, (2) all software is designed to interpret Unicode byte-code data representations, and then "process" the data as "data type: text," whether in internal computations that we never observe (as in chains of ML processes) or in outputs of "alphanumeric" (or other Unicode-encoded) symbols to be rendered in screens through corresponding software layers.
  • Unicode Emoji (pictographic symbols) | Code Chart List | Go to the Full Emoji List (Version 15.0)
    • Yes! All emoji are encoded as "characters" and not "images." Emoji are encoded as Unicode bytecode (in a range of "code point" numbers) to be interpreted in software and then projected in any device's graphics processing and screen. Without a common standard for encoding as part of a text-encoding scheme, what we see as emojis wouldn't work consistently for all text contexts and software applications (like text messaging, social media, and email) in any device with its own internal software and graphics rendering system.
    • See the Current Unicode Full Emoji Chart with Modifiers (for skin tone and other modifications).
    • The Unicode "test file" of all currently defined "emojis" (v.15.0) (with byte code, symbol, and description)
      This is a text file with Emoji symbols encoded to test how they are interpreted and displayed with the software and graphics rendering of various systems. You may find some don't display in your Web browser or app, and the emojis display differently from device to device. Why? (Think about each device's OS, graphics hardware and software, and type of screen.)
    • Bytecode definitions expose system levels: data encoding formats, device-software contexts, and screen-rendering software for each device's OS and graphics hardware are treated as separate levels in system design.
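You can inspect code points and UTF-8 byte encodings directly (a short Python sketch I'm adding for illustration, assuming a Python 3.8+ interpreter; it is not part of the Unicode reference materials):

```python
# Print the Unicode code point and UTF-8 byte sequence for a few
# characters: a Latin letter, an accented letter, a CJK ideograph,
# and an emoji. Note how the byte length grows: 1, 2, 3, 4 bytes.
for ch in ["A", "é", "中", "😀"]:
    print(f"{ch!r}: code point U+{ord(ch):04X}, "
          f"UTF-8 bytes {ch.encode('utf-8').hex(' ')}")

# An emoji with a skin-tone modifier is a *sequence* of code points,
# which software must compose into one rendered glyph:
waving = "\U0001F44B\U0001F3FD"   # waving hand + medium skin tone
print(len(waving), waving.encode("utf-8").hex(" "))   # 2 code points
```

This makes the key design principle visible: what we see as one "character" on screen may be one byte, several bytes, or even several code points, depending on the encoding rules, and every layer of software must agree on those rules.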

Writing assignment (Canvas Discussions Link)

Referring to at least two of the readings or sources,

  • Describe the distinction between binary electronic information (as the substrate, subsystem, or physical medium for encoding data) and data as defined structures (patterns, sequences) of binary units that allow us to encode (digitally represent) types of human symbol systems in a software interpretation context. Is this distinction clear? Does the Unicode example help?
  • Discuss your main "takeaway" learning points about digital text as a data type, and the Unicode standard for encoding written characters and symbols in any language through internationally adopted "byte code" definitions. What questions do you have? What would you like explained further in class?

Learning Objectives and Main Topics:

To advance in learning about how AI/ML techniques are applied to "data," students need to have a basic background on the specific meanings and concepts for "digital data" in the various domains of the "data sciences."

Understanding Data Types and Data Formats for AI/ML: Digital Images as Data

Last week you learned about the major data type: text (or "string") data, and how a standard encoding format (Unicode) enables a standard set of defined byte units to be interpretable across any kind of computer system and software programmed for text data.

This week, we will study digital images as forms of data. What is a "digital photo"? How are photo images encoded as digital data? How can software be designed to interpret, change, and transform a digital photo image? What are the basics that everyone can understand?

Readings and Video Lessons (general introductions):

  • Prof. Irvine, "Introduction to Data Concepts, Data Types, and Digital Images."
  • Video lessons: Continuing with Data, Data Types, and File Formats
    Why and how computer systems are designed to "compute" with specific sequences of bits, termed bytes, which give us the minimal units of data (a structured representation internal to the computing system); and how data types for digitized audio and images are defined.
    • Crash Course Computer Science: View Lessons 15 (Data Structures) and 21 (File Systems). In Lesson 21, notice how the digital representation techniques for sound and images are different from text, and note how digital image data is structured in data file formats.

Digital Data Case Study, 2: Digital Images as Data

  • Ron White and Timothy Downs, How Digital Photography Works. 2nd ed. Indianapolis, IN: Que Publishing, 2007. Excerpt.
    • Well-illustrated presentation of the basics of the digital camera and the digital image. The digital image principles are the same for miniature cameras in smart phones.
    • Study pages on "how light becomes data," and "how images become data," and Chap. 9 on how software works with digital images. This background is important for understanding how AI/ML algorithms can be programmed to interpret and analyze image data.
  • Video Lessons:
    • Images, Pixels and RGB (by co-founder of Instagram)
    • From Branch Education: How do Smart Phone Cameras Work?
      • Good "deblackboxing" view, and excellent background on light and human vision. Sections may seem like too much technical detail, but there's a lot of useful knowledge here.
  • Optional: A little more technical background if you have time and want to go further:
    Video lessons from Computerphile:
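The core idea in these readings -- that a digital image is a grid of numbers that software can compute with -- can be sketched in a few lines of Python (a toy illustration I'm adding, using plain lists and made-up pixel values, not actual camera or Photoshop internals):

```python
# A (tiny) digital image is just a 2D grid of pixel values. Each pixel
# here is an (R, G, B) triple of 8-bit intensities (0-255).
width = 4
red, white = (255, 0, 0), (255, 255, 255)

# Build a 2x4 image: top row red, bottom row white.
image = [[red] * width, [white] * width]

# "Editing" the image = transforming the numbers. A grayscale filter
# replaces each pixel with a weighted average of its R, G, B values
# (standard luma weights, reflecting human sensitivity to green):
def to_gray(pixel):
    r, g, b = pixel
    return round(0.299 * r + 0.587 * g + 0.114 * b)

gray = [[to_gray(p) for p in row] for row in image]
print(gray)   # [[76, 76, 76, 76], [255, 255, 255, 255]]
```

Everything a camera sensor or photo editor does ultimately reduces to producing and transforming grids of numbers like these, at much larger scale.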

Writing assignment (Canvas Discussions Link):

  • This assignment will help you understand what a digital photo is, and how a digital photo in a file format (.jpg, etc.) becomes computable data in software. There are 3 steps:
  • (1) Choose a photo image that you've taken -- or take a new one -- as a case to study. (If you have a digital camera, use that; otherwise, a mobile phone photo is OK). Upload the image file to a storage location that you can access, or send it to yourself, so that you can save it and work with it in software on a PC/Mac. Now you have a "data object" that is interpretable in software for displaying on a screen and "editing" with image software (for changing properties, size, etc.). Then, use any photo image viewer you have on your PC to view the image. Using the readings and lessons for this week, explain the process for how the image was created with the camera lens and sensor, and how the digitized light "information" registered in, and signaled from, the sensors behind the lens becomes encoded to form the 2D photo image recorded as an image file.
  • (2) Find out as much as you can about the properties of your image-as-data. If you know how to use Photoshop (or a similar digital photo editing program), you can display the "Information" in the "metadata" in the image file. (Most photos from digital cameras will have metadata in the "Exchangeable Image File Format," EXIF.) (Wikipedia background). Copy the metadata or do a screen shot of the file information for your post.
    • The basic photo software tools on our PCs/Macs don't provide a way to reveal the photo data. Online in-browser photo software usually won't have this function either (if you find one, let us know). Other digital photo options:
    • GU now has student licenses for the Adobe Creative Suite, which includes Photoshop and all the digital media production tools. Go to the GU Software WebStore to sign up for an Adobe account and download. It will take a little time, but it's worth it.
    • Two excellent, free photo editing tools (download and learn how to use, as your time allows):
      • Irfanview (PC Windows) [This is great, "lightweight" software. My "go-to" when I don't need all of Photoshop.]
      • GIMP (Windows and Mac) [This is a very full-featured, freeware version of most of what's in Photoshop. It will take a little longer to learn, but worth it if you're into photography and don't want the whole Photoshop package.]
    • If you don't have time to use new software, these online sites provide image analysis (drag or upload your image as directed):
  • (3) Use any software tool for changing your photo image (cropping size or details, "filters" for colors or styles, morphing shapes, etc.) and explain how this works, as best you can from the background this week. Save two examples of changes to your image. Can you see how the software is designed to modify (transform) the basic math for the locations (x,y positions) and color values of areas in the 2D pattern of the digital image? (These patterns of values across the whole 2D array of the image are what AI/ML analysis starts with.)
  • In your discussion post, upload your image (and any edited or modified versions) at the beginning, and insert your discussion and analysis below the images. Have fun!
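The edits described in step (3) can be sketched as operations on the pixel grid itself (a toy Python example I'm adding, with made-up grayscale intensity values rather than a real image file):

```python
# Cropping, flipping, and brightening a digital image are just
# operations on the (x, y) coordinates and values of the pixel grid.
# Represent a 4x4 grayscale image as a list of rows of intensities.
image = [
    [ 10,  20,  30,  40],
    [ 50,  60,  70,  80],
    [ 90, 100, 110, 120],
    [130, 140, 150, 160],
]

# Crop: keep rows 0-1 and columns 1-2 (a sub-rectangle of the grid).
crop = [row[1:3] for row in image[0:2]]

# Horizontal flip: reverse the x-order of every row.
flipped = [row[::-1] for row in image]

# Brighten: add a constant to every value, capped at the 8-bit max 255.
brighter = [[min(v + 50, 255) for v in row] for row in image]

print(crop)      # [[20, 30], [60, 70]]
```

Photo-editing software performs exactly these kinds of coordinate and value transformations, just optimized and applied to millions of pixels at once.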

Learning Objectives and Main Topics

  • Goal: Learning the foundational principles for Pattern Recognition in AI/ML, and how basic pattern recognition techniques are used in digital image analysis.
  • Pattern Recognition (which includes Feature Detection) is the main problem domain of AI/ML: how can we design computational systems that "recognize" (can classify, categorize, or label) the regularities or invariants (= "patterns") in all kinds of data, but specifically in selected sets of data?
  • This main problem domain for application in all kinds of data has created a large multi- and inter-disciplinary field that combines computer science with philosophy, logic, mathematics, and cognitive psychology.
  • Detecting "features" and recognizing "patterns" in all the varying kinds of properties in data is also based on logical-mathematical models for establishing statistical invariance (getting better and more finely tuned approximations to properties of data that indicate patterns, what stays the same over millions of variable instances). We will never have 100% absolutely true or false feature detection and pattern recognition, but what most methods in AI and ML are designing for is getting "close enough" approximations on which decisions can be made and future states of things can be projected.
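The "close enough approximation" idea above can be illustrated with a toy classifier (a minimal Python sketch I'm adding, with made-up data points; real ML systems work over millions of features with far more sophisticated statistics):

```python
import math

# Toy pattern recognition: classify a 2-feature data point by which
# class "centroid" (the average of known examples) it is closest to.
# This never yields absolute truth -- only a best statistical guess.
class_a = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9)]   # known examples, class A
class_b = [(4.0, 4.2), (3.9, 4.1), (4.2, 3.8)]   # known examples, class B

def centroid(points):
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def classify(point):
    dist_a = math.dist(point, centroid(class_a))
    dist_b = math.dist(point, centroid(class_b))
    return "A" if dist_a < dist_b else "B"

print(classify((1.0, 1.1)))   # 'A'
print(classify((3.5, 4.0)))   # 'B'
```

The "pattern" here is simply statistical regularity in the example data: points near one cluster of examples get that cluster's label, which is a decision based on approximation, not certainty.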

Readings and Video Introductions:

AI/ML application case study: Pattern Recognition Using Neural Networks

  • Andrej Karpathy, “What a Deep Neural Network Thinks About Your #selfie,” Andrej Karpathy Blog (blog), October 25, 2015,
    • Karpathy (who went on to lead AI at Tesla) provides an "inside the blackbox" view of how "Convolutional Neural Networks" (the mathematical network graphs that can continually readjust the "weights" [values] between the nodes in the probabilistic calculations) can be used to do feature detection, pattern recognition, and probabilistic inferences/predictions for classifying selfie photo images.
    • All "neural net" ML techniques are based on models in mathematical graphs and matrices for multiple layers of algorithms that can work in parallel and recursively (feeding outputs back in). This provides mathematical models for implementing multiple weighted statistical calculations over indexed features (the features identified, automatically or predefined). The "nodes" in the graphs represent a part of a statistical calculation with many interim results, which can be fed back into the program, and "fine tuned" for targeted output approximations. The layered graph models are encodable as linear algebra and other kinds of statistical formulas in the programming language used for the kind of data analysis being specified. All the abstract algorithms must be encoded in a runnable program.
    • By looking a little inside the "blackbox," you can also get a sense of the ethical implications of programming criteria (the parameters defined for selecting certain kinds of features and patterns) when the ML design is automatically "tuned" for certain kinds of results.
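The claim that "neural nets" are really layered linear algebra can be shown in a tiny Python/NumPy sketch (our own toy example, not Karpathy's actual ConvNet). The "nodes" and "weights" of the network graph are encoded as matrices, and a "layer" is a matrix multiplication followed by a simple nonlinearity:

```python
# A toy two-layer network forward pass: the graph's "weights" are matrix
# entries, and each layer of "nodes" is a weighted sum (matrix multiply).
import numpy as np

rng = np.random.default_rng(42)  # fixed seed: illustrative, untrained weights

x = rng.random(4)          # 4 input "features" (e.g., pixel statistics)
W1 = rng.random((3, 4))    # weights connecting 4 inputs to 3 hidden nodes
W2 = rng.random((2, 3))    # weights connecting 3 hidden nodes to 2 outputs

hidden = np.maximum(0, W1 @ x)   # hidden layer: weighted sums + ReLU nonlinearity
scores = W2 @ hidden             # output layer: another set of weighted sums

# softmax converts raw scores into a probability distribution over 2 classes
probs = np.exp(scores) / np.exp(scores).sum()
```

"Training" is the process of repeatedly adjusting the entries of W1 and W2 (the weights) so the output probabilities better match known examples; this sketch shows only the forward calculation that those weights parameterize.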

Writing assignment (Canvas Discussions Link)

  • This week will add further "building blocks" to understanding computing and AI/ML design principles through an introduction to pattern recognition by learning about an application in face-image analysis (Karpathy's article on "neural net" analysis of "selfies"). Study the background readings, and then discuss what you find to be the key points and issues in the Karpathy article. Apply as much as you can from our learning path so far.
  • Note: In your reading and discussion this week, don't be put off by the mathematical and technical models introduced in the readings. You will discover that the mathematical and logical models for the graph matrix algorithms (which are unfortunately termed "neural nets") are all in service of our cognitive abilities for recognizing and interpreting patterns, and then making projections (predictions) based on already "learned" patterns for analyzing new, future data.

Learning Objectives and Main Topics

Students will learn the basic principles of Natural Language Processing (NLP), which includes pattern recognition techniques for both text analysis and speech recognition in AI/ML methods. AI/ML applications use a suite of computational techniques for sorting, classification, feature extraction, pattern recognition, translation, and text-to-speech and speech-to-text conversion.

You will learn the basics of the linguistic and computational levels of analysis and processing involved in NLP implementations (text, speech, and both). We will focus on examples of machine translation (a troublesome term) and speech recognition or speech processing.

Natural Language means the human languages acquired naturally by being born into a language community, as opposed to the second-order "artificial" or "formal" languages developed for mathematics, the sciences, computing, philosophy, and other fields. Formal "languages" are usually developed with specialized terms, notation, and symbols, and are termed "metalanguages": specialized "languages" about other languages.

NLP, as a combination of linguistics and computer science, is built up with many kinds of formal "metalanguages" (from logic and mathematics) for defining, describing, and analyzing instances of natural language (spoken and written) as data. Computer programming "code" is a metalanguage for converting interpretations of data into data at another level.
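A few lines of Python (our own illustration) show this metalanguage idea at its simplest: program code re-describes a natural-language sentence as countable data objects, the first step in most NLP pipelines:

```python
# Code as "metalanguage": converting a natural-language sentence into data.
from collections import Counter

sentence = "the cat sat on the mat"

tokens = sentence.lower().split()  # segment the text into word tokens
counts = Counter(tokens)           # re-describe the text as frequency data
```

Real NLP systems add many further layers of formal description (grammatical structure, word embeddings, statistical language models), but each layer follows this same principle: interpretations of language are encoded as data for computation at another level.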

Readings and Video Introductions:

Examples and Implementations

Case: Exposing Limitations of Neural Net ML Methods in NLP

  • OpenAI: New ML Language Model
    • This is the language model (GPT-2, based on the Transformer architecture) that surprised everyone by how well the algorithms could generate well-formed "fake news" from millions of data samples of news writing.
    • Since ML is designed for pattern identification and recognition, the algorithms will provide recognized patterns (because the patterns are there -- in human-composed sentences), but the fact of a pattern has nothing to do with its meaning, the relation of a pattern to its use and its contexts of interpretation (truth values, consistency in logic, beliefs).
  • Will Knight, “An AI That Writes Convincing Prose Risks Mass-Producing Fake News,” MIT Technology Review, February 14, 2019.
  • Karen Hao, “The Technology Behind OpenAI’s Fiction-Writing, Fake-News-Spewing AI, Explained,” MIT Technology Review, February 16, 2019.
    • Note the continuing reification of "AI" as an entity in journalistic discourse.

Writing assignment (Canvas Discussions Link)

  • Using the key concepts and descriptions of the technologies in the background readings and videos, describe the design principles of one or more levels at work in one of the NLP applications above.

Learning Objectives and Main Topics:

  • What are the main design principles and implementations of AI systems in interaction interfaces for information, digital media, and Internet/Web services that we use every day?
  • What do we find when we de-blackbox the algorithms, computing processes, and data systems used in "virtual personal assistants" (Siri, Google Assistant and Google speech queries, Alexa, Cortana, etc.)?
  • Deblackboxing the levels and layers in speech recognition/NLP applications (Siri, Alexa, etc.) by using the design principles method.

New: Online Learning Sources for AI/ML Technologies (compiled by Prof. Irvine)

  • Good lessons and sources for learning more at your own pace.

Continuing with OpenAI's GPT-3 NLP Generative Text "Transformer" System

Readings and Background on Applications

Virtual Assistant Speech Recognition Systems

  • Survey the implementations of "Virtual Assistant" speech recognition systems below, and choose one kind of system to focus on. Try to keep focused on the Computing/AI/ML design principles involved, and not on the products (which will be black boxes).

  • "Virtual Assistants" and Recommendation Systems: enhancement and/or surveillance?
  • Amazon Lex: Amazon's description of the Natural Language Understanding (NLU) service that Amazon uses (for Alexa and product searches) and also markets as an AI product for other companies on the Amazon Web Services (AWS) Cloud platform.
  • Google Assistant: Wikipedia background [mostly business and product information]
    • Google Assistant: Google Homepage [company info and app integrations]
    • Google's Patent Application for "The intelligent automated assistant system" (US Patent Office)
      [Scroll down to read the description of the "invention", and/or download the whole document.]
      [Patents are good sources for the design principles and a company's case for why their design (intellectual property) is distinctive and not based on "prior art" (already known and patented designs).]
    • Abstract of the system in the patent (same general description as for Siri):
      "The intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionally powered by external services with which the system can interact."
    • Google Assistant for Developers (Google)
  • Apple Siri: Wikipedia background [note prior development of the system before Apple]
    • Apple's Patent Application for "An intelligent automated assistant system" (US Patent Office, 2011)
      [Scroll down to read the description of the "invention", and/or download the whole document.]
      [Note the block diagrams for the layers of the system.]
      [There are now many patent lawsuits going on for "prior art" in Siri and speech recognition systems.]
    • Abstract of the system in the patent (same general description as Google):
      "An intelligent automated assistant system engages with the user in an integrated, conversational manner using natural language dialog, and invokes external services when appropriate to obtain information or perform various actions. The system can be implemented using any of a number of different platforms, such as the web, email, smartphone, and the like, or any combination thereof. In one embodiment, the system is based on sets of interrelated domains and tasks, and employs additional functionally powered by external services with which the system can interact."
    • Apple's WIPO Patent, "Intelligent assistant for home automation" (2015).
    • Apple Machine Learning Journal (1/9, April 2018): "Personalized 'Hey Siri'."
    • Apple Machine Learning Journal: "Hey Siri..." [The design of Apple's "Neural Net" Speech Recognition system]
    • An inside view of how app developers can use Siri in other apps (Apple Developer).
    • Siri uses Wolfram Alpha for "knowledge base" (or "answer engine") answers

Presentation (for discussion in class and study on your own):
De-blackboxing the Virtual Assistant, Speech Recognition/NLP and Database System

Critique of Virtual Assistants and Recommendation Systems

Writing assignment (Canvas Discussions Link)

  • Using the background from this week and the concepts that you've learned so far, choose one of the implementations of the speech recognition "virtual assistant" applications above and "de-blackbox" as many of the levels and layers in the design that make the system work in the way we observe it to work. Again, try to stay focused on uncovering the underlying design principles and implementations of NLP, and not on the product or company brand. You can apply what you've learned about AI/ML speech recognition/NLP systems for creating an interface "bridge" to the data and media services invoked behind the scenes. This discovery of levels and system modules will give a clearer picture of the complex system design.
  • The deblackboxing will at first be difficult because there is not much open-source published information on these systems, so we need to use our conceptual design-principles tools to "reverse engineer" how the systems must be designed to work the way they do. If you get stuck, you can use "reverse engineering" as a thought experiment (i.e., model a new version of a designed thing, not by using the plans of a branded product, but by creating a model of the required technologies that must be combined and managed to make it work): if we were going to design and build a service with a user-facing app like Siri or Alexa, what would it take to do it? Inside the black box: what unobservable (invisible) layers of technologies and design principles are required to explain what we do observe?

Learning Objectives and Main Topics:

This unit will provide a top-level overview of the ethical, legal, and government policy issues for the development and application of AI/ML in services, products, and "black boxed" applications.

The wide deployment of AI applications (ML, Deep Learning, Face Recognition, Speech Recognition systems) by businesses and governmental entities has provoked important ethical questions about AI and all uses of data. We will explore some of the main issues and learn via case studies of AI applications, both in everyday life and in behind-the-scenes "invisible" technologies that affect privacy, security, surveillance, and human agency.

Frameworks for Study, Research, and Advocacy:

The starting point for ethics in relation to any technical system is a commitment to truthful, de-blackboxed, open and accessible descriptions, definitions, and explanations (see the Floridi and Cowls reading below). There is no magic, but much of the discourse for AI/ML is hype that keeps everything blackboxed and working like "magic" in the service of commercial (corporate) interests. Any definition or conception of an "ethics for AI" must begin with truthful definitions and descriptions. Here's where our three core methods for truthful de-blackboxing can come to the rescue:

  • The Complex Systems View: How can human agency for Ethics and Social Responsibility be included, modeled, or introduced in the design and implementation of AI/ML processes across whole complex systems from the beginning (not as a "fix" or add-on when something goes wrong)?
  • The Design View of Complex Systems: How can the design view (and consequences for design) of AI/ML systems be communicated? AI/ML is always implemented in designs that must function in interdependent Socio-Technical Systems (composed of many Subsystems, Modules, and Levels [Layers]). Can the "Explainable" and "Intelligible" AI movement be the context for this communication?
  • The Semiotic Systems View: Computational and AI Systems are designed Semiotic-Cognitive Artefacts, for which human agency and collective human ownership needs to be clearly revealed, explained, and continually reclaimed. Explaining the underlying facts about computing systems must be part of the ethics of truth-telling about any computational technology. How can these truths be communicated so that more people can claim ownership over the designs and implementations of AI/ML, and become involved in collective agency for establishing an ethics of technology?

The Huge National and International Challenges

Though the above points are viable and well-understood directions in research and provide frameworks for debate, the major challenges for AI/ML ethics are political: how can any collective decision about ethics be put into practice in policy, regulation, and law (national and international)?

Almost everything in computing (hardware, software) and data communications (Internet) is unregulated, and subject only to industry ecosystem standards and international standards agreements. Computing and data (digital media, information protocols, etc.) are global and international.

Technologies and specific versions of consumer products are governed by no dedicated legal or judicial institution, except in intellectual property disputes. Software products, computing components, data communications hardware and software (Internet, wireless), and online transaction systems are based on many levels of international agreements and standards. How can AI/ML models, algorithms, and stored data used for commercial or governmental purposes be "controlled" or "regulated" at the level of ethics, policy, and law?

Video Introduction on AI, Ethics, and Society

Intro Readings and Backgrounds

Ethics, Policy, and Law: Industry, Corporate, and Governmental Issues

Cases: Ethics of Face Recognition Technologies and Use of Personal Data

For Further Discussion in Class: Other Applications being discussed for ethical issues:

  • "Bias" in machine learning algorithms used for business applications; personal data and privacy.
  • AI/ML and surveillance
  • Social Media business models: How can we analyze and critique the AI + data systems in business applications designed to maximize user attention, market personal data, and drive transactions (sales)?

E-Text Library for Further Research (Shared Drive: Data, Ethics, AI)

Writing assignment (Canvas Discussions Link)

  • There are many ethical, political, and ideological issues surrounding AI/ML applications. From the readings and examples of cases, identify what you think are 1 or 2 important issues and explain why. Use your deblackboxing skills to critique what is being discussed, and also to untangle and expose false, alarmist, or misunderstood ideas about AI, data, and computing systems.

Learning Objectives:

Learning the basic design principles and main architecture of Cloud Computing:

  • "Software as a Service" (SaaS)
  • "Platform as a Service" (PaaS)
  • "Infrastructure as a Service" (IaaS)
  • "Virtualization" of server systems, scalable "on-demand" memory

"The Cloud" architecture: a model for integrating the "whole stack" of networked computing.

The design principles for Cloud computing systems extend the major principles of massively distributed, Web-deliverable computing services, databases, data analytics, and, now, AI/ML modules. Today, a simpler question for the ways we use the Web and Internet data might be "what isn't Cloud Computing"?

The term "Cloud" began as an intentional, "black box" metaphor in network engineering for the distributed network connections for the Internet and Ethernet (1960s-70s). The term was a way of removing the complexity of connections and operations (which can be any number of configured TCP/IP connections in routers and subnetworks) between end-to-end data links. Now the term applies to the many complex layers, levels, and modules designed into online data systems, mostly at the server side. The whole "server side" is "virtualized" across hundreds and thousands of fiber-optic linked physical computers, memory components, and software modules, all of which are designed to create an end product (what is delivered and viewed on screens and heard through audio outputs) that seems like a whole, unified package to "end users."

An Internet "Cloud" Diagram: What happens "inside" a Cloud is abstracted away from the connections to and from the Cloud: only the "outputs" and connections to the Cloud as a system need to be known.
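The abstraction principle in the diagram can be sketched in a few lines of Python (a deliberately simplified illustration of our own, not any provider's architecture). Callers see only the service's interface (request in, response out); the layered internals, like the connections inside the network cloud, are hidden:

```python
# Sketch of "Cloud" abstraction: only the interface is visible to callers.

def _route(request):
    """Hidden layer: could be any number of hops through routers/subnetworks."""
    return request.upper()

def _store(result):
    """Hidden layer: 'virtualized' storage spread across many machines."""
    return {"stored": result}

def cloud_service(request):
    """The only visible surface: inputs and outputs, internals abstracted away."""
    return _store(_route(request))

response = cloud_service("hello")
```

The underscore-prefixed functions stand in for whole subsystems; a user of `cloud_service` needs to know nothing about them, which is exactly what "abstracting away complexity" means in the Cloud model.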

Learning the design principles of "Cloud Computing" is an essential tool in our de-blackboxing strategy. Many of the computing systems that we are studying -- and use every day -- are now integrated on platforms (a systems architecture for data, communications, services, and transactions) designed for convergence (using combinatorial principles for making different systems and technologies interoperable) for data, information, and AI/ML data analytics. For organizations and businesses on the supply side of information and commercial services, subscribing to a Cloud service provides one bundle or suite of Web-deliverable services that can be custom-configured for any kind of software, database, or industry-standard platform (e.g., the IBM, Amazon AWS, and Google Cloud services).

Internet-based (or Internet-deliverable, Internet-distributed) computing continues to scale and extend to many kinds of online and interactive services. Many services we use every day are now managed in Cloud systems with an extensible "stack" architecture (levels/layers) all abstracted out of the way from "end users" (customers, consumers) -- email, consumer accounts and transactions (e.g., Amazon, eBay, Apple and Google Clouds for data and apps), media services (e.g., Netflix, YouTube, Spotify), and all kinds of file storage (Google app files) and platforms for Websites, blogs, and news and information.

Readings & Video Introductions (read/view in this order)

Major Cloud Service Providers: Main Business Sites

Writing assignment (Canvas Discussions Link):
choose one topic to focus your learning this week

  • Backgrounds for thinking and writing:
    • The Cloud system model provides ways to combine and integrate the "whole stack" of computing in the Internet/Web interactive client/server model. What was once modeled on distributed computing and processes in networked servers (considered as individual computers) is now virtualized in a design architecture across thousands of computers with high-speed connections that share processors, memory, and unlimited provision of storage, backup, and security. From any point of view (especially that of the end "user"), a Cloud system (like Amazon AWS and Google's Cloud) is a complex black box full of intentionally engineered subsystem black boxes that "abstract away" the complexity so that for a user/customer the screen+output "presentation layer" seems like a transparent, seamless, unified service. All the "back-end" processes and transactions are handled behind the scenes, and users only receive the results.
    • The Cloud architecture (although a well-known international standards-based model for system integration in layers and modules) is operationally available only through a subscription and build-out of services with an account on one of the major Cloud service provider companies.
  • (1) AI/ML modules and data service layers are now becoming a routine part of the Cloud "bundle" in a subscription package. Based on your background so far and this week's readings, identify one or two main points of convergence in the design and use of AI/ML and Data systems implemented in the Cloud architecture ("Virtual Assistants," speech recognition, and Web/Mobile translation apps are all Cloud-based systems), and map out for yourself how the modules and layers/levels are designed for combination.
  • (2) Can you think through some of the consequences -- positive and negative, upside/downside -- in the convergence of the technologies on one overall "unifying" architecture (system design) provided only by one of the "big four" companies (Google, AWS, IBM, Microsoft)?

Learning Objectives and Main Topics:

In this unit, students will learn the basic principles at work in the converging "platforms" for data and databases, Internet/Web accessible data services, Cloud Computing, AI, and data analytics -- the whole bundle of which has become known as "Big Data."

"Big Data" (large quantities of stored data) is clearly connected to recent successes in ML applications. All "neural net" models are known to be "data hungry" -- that is, they work best when calculating over millions of data "objects" for statistical maps of the properties, features, and attributes that are the targets of data analysis.

Our current "data environment" is shaped by many kinds of technologies that are managed in multiple levels of interoperable Internet/Web-accessible data of all kinds. This includes the background layers of Internet and Web programming (for active and interactive networked client/server applications), AI/ML techniques for data, Cloud Computing for provisioning all levels of computing-and-networking-as-a-service (OS, platform, software, databases, real-time analytics, and memory storage), and Internet of Things (IoT) (IP-connected devices, sensors, and remote controllable actions).

This environment forms our biggest, complexly layered "black box", composed of hundreds of subsystem and modular black boxes inside black boxes, all orchestrated for what we call "big data" functions. "Big data" just means massive amounts of data generated from multiple sources (human and human-designed computational agents) and stored in massive arrays of memory accessible to software processes at other levels in the whole system. The main dependencies of both AI/ML expansion and "big data" are cheap, modular memory, fast multi-core processors for parallel and concurrent processing, and fast, ubiquitous IP-connected networks (both wired and wireless).

In studying a technology reified with a name (like "AI," or "Big Data") we must always immediately remove the name and consider the system of dependent technologies without which the top-level "black boxes" would be impossible. These are all complex systems, designed to scale with massive modularity and precise attention to levels of operation. Our access point to understanding this kind of system is always through opening up the design principles, and always keeping the human-community designs in focus, especially when confronting the torrents of marketing and ideological discourse from the business and technical communities for the technologies branded and marketed as products.


  • "Big Data": Wikipedia overview. [This is all over the place, but it provides a view of all the many contexts of Big Data, and how this concept overlaps with other data and AI/ML issues.]
  • "Big Data": ACM Ubiquity Symposium (2018). Read (html and pdf versions available):
  • Rob Kitchin, “Big Data, New Epistemologies and Paradigm Shifts,” Big Data & Society 1, no. 1 (January 1, 2014): 1-12. [Kitchin's summary of concerns discussed in the research in his book below.]
  • Rob Kitchin, The Data Revolution: Big Data, Open Data, Data Infrastructures and Their Consequences. London; Thousand Oaks, CA: SAGE Publications, 2014. Excerpts.
    [This is an excellent book for our focus on kinds of data, uses of data, and ethical consequences. You can survey the selected chapters to get a sense of Kitchin's approach. Also excellent bibliography of references if you want to follow up on issues in data ethics.]
  • Cathy O’Neil, Weapons of Math Destruction: How Big Data Increases Inequality and Threatens Democracy (New York: Crown, 2016). Selections.
    [This book has some valuable insights and important questions, but they are usually presented in a journalistic and anecdotal way (using an example to prove negative effects). The overall point is important, but "math" and "data" (as computational objects) aren't the problem. The full suite of ML models and methods (modeled in algorithms to be coded in a programming language) is designed to detect and establish patterns by closest approximation to invariants across unlimited amounts of data. The problem is faulty, incorrect, non-representative, and/or incomplete data samples over which the pattern recognizers are "trained" before making predictions and decisions about new data. It comes down to basic statistical methods, the final outputs of which are based on inductive logic over the quantities of sampled data; if the data being used for "normalized" patterns is wrong, then the AI/ML processes can produce false correlations that, when used in decisions for hiring or financial credit, will continue to reinforce inequality.]

Writing assignment (Canvas Discussions Link)

  • "Big Data" continues to be a vague term, often used in marketing and journalistic hype. From what you have learned in the course, how would you explain the key concepts in "Big Data" and "data science" as applied to an implementation in a process that we use every day? (Review the readings from Week 6 in this context.)

Learning Objectives:

This class meeting will be devoted to synthesizing (drawing together, unifying) what we have learned about the design principles and key concepts for AI/ML, Information and Data, and underlying computational methods. In our review and "summing up," we will want to reflect on the main philosophical and conceptual foundations of AI/ML, Information, and Data, and the social and ethical consequences of current implementations of these technologies, now and for the future.

Readings and Video:

  • Film Documentary, Coded Bias (Dir. Shalini Kantayya, 2020). Available on Netflix, and free to view on the PBS Independent Lens site (until 4/21/21).
  • This documentary has received a lot of attention, and the face recognition issues are very important. You will find that the movie production is full of confusing talk about "algorithms," and there is little or no explanation of ML and the mathematical networks used for the statistical modeling, and only one mention of how training data sets can be improved and corrected (good people in the field are already working on this, but this work is not mentioned).
  • We can also use this documentary as a case for applying what you have learned for critique and explanations. You should now be able to help "deblackbox" the talk about AI and explain key truths for others: (1) all the tech is based on computational design and openly understood (or learnably understandable) design principles, (2) because AI/ML/Big Data are designed systems, communities of people with responsibility for these systems can intervene and redesign for truer outcomes, and (3) we all have an ownership stake in these technologies, not only because they are active behind every computational, networked device, software, or data service that we use, but because the symbolic functions themselves -- the math, logic, and kinds of data being interpreted (language, images, video, music, personal data, etc.) -- belong to all of us as human cognitive agents in communities with others. We all "own" this. Our human identities are based on shared, collective symbolic systems for communication, information, expression, learning, and knowledge, and this includes all the logic and mathematical operations that go with our shared symbolic systems (all of which preceded our digital era).

Critique and Analysis of Current Descriptions of AI/ML

  • Zachary C. Lipton and Jacob Steinhardt, “Troubling Trends in Machine Learning Scholarship,” ArXiv:1807.03341 [Cs, Stat], July 9, 2018. Presented at ICML 2018: The Debates.
    • This is a very enlightening article. The authors present analyses of some of the rhetorical mistakes and a critique of discourse used in describing current work in AI/ML -- from an insider's view.
    • The analyses are also very relevant for anyone wanting to understand what is going on in this field, develop clear and truthful explanations, and critique non-explanations and mystification.
    • The authors are part of the Machine Intelligence Research Institute (MIRI) at Berkeley. [View the site to see the kind of research going on at MIRI.]

Current Issue in Computer Science Ethics: "Explainable / Interpretable AI and ML"

    • ACM Video: Techniques for Interpretable Machine Learning
    • This short video introduces one approach that communities in computer science are taking to deblackbox AI/ML. You can see that "explainability" and "interpretability" depend on communicating the truth about computer systems and algorithm design. You will learn more in a few weeks how using complex ML layered networks (graphs of combined computations) can generate unpredictable results that require further redesign.

Current Issues and Promising Directions: Combining Ethics, Design, and Policy

Confirm Your Own Learning Achievements in the Course!

  • Re-read the "Introduction to Key Concepts, Background, and Approaches" for the course.
  • (I promised that by following our learning path, step by step, you would be able to understand these key concepts in computing and AI. If you need help filling in what isn't clear yet, ask in class.)

In-class discussion: Developing plans for Final Projects

Writing assignment (Canvas Discussions Link)

  • Use this week to review and reflect back on the main concepts and methods of the course, and think about how you could develop your own synthesis (ways of combining ideas, approaches, models, and methods) that could lead toward an approach for your final project. Use the readings and approaches to ethical and social concerns from last week and this week, if these ideas provide a way to reflect on the topics of the course in a unified way.
  • For your post, you can pose questions about a topic, method, or key concept that we have covered, and that you would like to know more about and follow up with further research. You can also discuss any "aha!" moments that you experienced when reflecting back on what you have learned in the course, and connections between concepts and/or principles of technologies that you've discovered this week.

Learning Objective:

  • Developing final research projects.

Beginning Notes, Outlines, and Bibliography for your Research

General Final Project Instructions

  • Post any notes or bibliography references that you are working with (if ready) to discuss with your professor.

Class Discussion:

Group discussion on your final project topics, ideas, and current state of your research and writing.

  • (Canvas Discussions)
  • Final Project Research Essays by students in 607: Spring 2019 | Spring 2020

  • Deadline: The target deadline for posting your essay is one week after the last day of class. Extensions are possible if you write to the professor to explain. We will be flexible in this unusual time so that you will be able to do good work, and finish the semester well. (But we all need deadlines!)
  • Using your written project after the course: You can post your final "capstone" project as part of your "digital profile" wherever it can be useful to you (in a resume, LinkedIn, social media, internship applications, job applications, and applications for further graduate studies).