ECE101 Spring 2025 Exam 2

Study Guide

Question 1 (Lecture: Search Engines)

Explain in your own words how search engines work.

Sample Solution

Search engines work according to a five-step process:
  1. Crawl the web to gather billions of documents (text, images, videos).
  2. Organize the documents for fast searching by placing them in different sets S by topic or keyword. This is also called indexing the documents (like the index in a textbook).
  3. Order the documents in each set S by decreasing reputation, so that more important documents are ranked higher than less important ones.

These three steps occur before anyone does a search. Two more steps are performed when a user searches for a particular phrase (for example, P):
  4. The search engine uses the phrase P to filter the documents in S in order to find the relevant set R.
  5. It reorders the search results once more, based on knowledge of the user (search history, YouTube preferences, travel, purchases, and so on), to create the final list L displayed to the user. (Sponsored results or ads often go at the top!)
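
The five steps above can be sketched as a toy program. This is only an illustration under stated assumptions, not how any real search engine is built: the three documents, their reputation scores, the `search` function and its `preferred` parameter are all made up, and for brevity the ordering of steps 3 and 5 is done in a single sort at query time.

```python
from collections import defaultdict

# Step 1 (crawl): hypothetical documents, name -> (text, reputation score).
documents = {
    "page_a": ("binary search on sorted data", 0.9),
    "page_b": ("search engines crawl the web", 0.7),
    "page_c": ("cooking pasta at home", 0.4),
}

# Step 2 (index): map each keyword to the set of documents containing it.
index = defaultdict(set)
for name, (text, _) in documents.items():
    for word in text.lower().split():
        index[word].add(name)


def search(phrase, preferred=()):
    """Return document names matching every word of `phrase`, best first."""
    # Step 4 (filter): keep only documents that contain all the words.
    relevant = set(documents)
    for word in phrase.lower().split():
        relevant &= index.get(word, set())

    # Steps 3 and 5 (order, then personalize): sort by reputation, with a
    # made-up bonus for documents this user is known to prefer.
    def score(name):
        bonus = 1.0 if name in preferred else 0.0
        return documents[name][1] + bonus

    return sorted(relevant, key=score, reverse=True)


print(search("search"))                        # ['page_a', 'page_b']
print(search("search", preferred=("page_b",))) # ['page_b', 'page_a']
```

The second call shows how the same phrase can produce a different final list for a different user.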
    ​
    Question 2 (Lecture: Search Engines)

    List the different types of information about the user that are typically available to the search engine.

    Sample Solution

    Information available to search engines includes:
    ◦ IP address (where are you?)
    ◦ browser
    ◦ OS and computer
    ◦ search history
    ◦ more or less anything else that they can deduce once they have figured out who you are and consulted their records on your preferences

    Question 3 (Lecture: Search Engines)

    List the advantages and disadvantages of tracking by search engines from the perspective of the user.

    Sample Solution

    Tracking can be really useful in situations like the following:
    ◦ Weather tracking, e.g. tornado warnings
    ◦ Personalized, location-specific search results for restaurants, gas stations, stores and repair shops
    ◦ Finding deals and discounts you like
    ◦ Finding interesting websites that you may not know about, based on your browsing history

    Tracking can be compromising in situations like the following:
    ◦ Lack of privacy:
        ◦ Weird ads popping up when screen-sharing with someone
        ◦ Search results and ads popping up on devices sharing the same IP address (roommates and family members)
        ◦ Companies having access to your browsing history
    ◦ Differing prices based on browser or OS
    ◦ Sub-optimal search results tailored to please you even if the websites do not have the most reliable information

    Question 4 (Lecture: Search Engines)

    What are some ways legislation can help with protecting the privacy of internet service users?

    Sample Solution

    Governments can pass legislation to enforce privacy regulation. An example is the General Data Protection Regulation (GDPR) passed by the European Union. Under this law, companies are required to
    ◦ securely process the personal data they have collected
    ◦ collect only data that is necessary or relevant for the application
    ◦ make the data that they have collected on a person available to that person
    ◦ edit or delete the data when requested by the user

    Question 5 (Lecture: Search Engines)

    Explain the binary search process using any example you like.

    Sample Solution

    Remember that an important requirement for binary search is that your data must be sorted, like the words in a dictionary or the contacts in an address book, which are sorted alphabetically. The items can be sorted according to any order: alphabetical, numeric, or a custom order (as long as you are clear about the ordering).
    To find an item X in the sorted collection, start in the middle and compare X with the middle item. If X equals the middle item, you are done. If X is higher in order, move to the second half of the collection; otherwise move to the first half. Repeat the process (find the item in the middle and compare X with it) on each successive half until you find X.

    Example: “Guess the Number” game.

    You ask your friend to think of a number between 1 and 500 and attempt to guess it in fewer than 10 attempts.
    Start at 250 and ask your friend if their number is greater or less than 250. If greater, move to the numbers 251-500; otherwise move to 1-249. If it was greater than 250, ask if their number is greater or less than 375. In this way, keep reducing your search space to half of what you had before, until you find the number.
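
The guessing game above can be sketched in a few lines of Python. This is a minimal sketch; the function name `guess_number` and its parameters are made up for illustration. It records every guess so you can check how many attempts were needed.

```python
def guess_number(secret, low=1, high=500):
    """Return the list of guesses binary search makes before hitting `secret`."""
    guesses = []
    while low <= high:
        mid = (low + high) // 2      # always guess the middle of what is left
        guesses.append(mid)
        if mid == secret:
            return guesses
        if secret > mid:
            low = mid + 1            # the number is in the upper half
        else:
            high = mid - 1           # the number is in the lower half
    raise ValueError("secret is outside the range")


print(guess_number(400)[:2])   # [250, 375]
print(max(len(guess_number(s)) for s in range(1, 501)))   # worst-case number of guesses
```

Because each guess halves the remaining range, no number between 1 and 500 takes more than 9 guesses, which is why "fewer than 10 attempts" is always achievable.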

    Question 6 (Lecture: Search Engines)

    1. Mention a few possible ways web pages could be ranked by a search engine.
    2. Explain in your own words the intuition behind “page rank” being a measure of importance of a web page.

    Sample Solution

    Ranking a web page

    Some possible ways we could rank web pages:
    1. “Reputation” of the company maintaining the web page
    2. Age of the company maintaining the web page
    3. Quality of content on the page
    4. Number of websites linking to the web page
    5. “Page rank” of the web page: a graph centrality measure computed for every node in a web graph created by web crawling. It gives a score based on the importance of a node in the graph.

    Intuition behind page rank

    The main intuition behind page rank is that a page is “good” if others point to it. While the number of incoming links to a web page (incoming edges for a node in the web graph) could be a practical measure of importance, links from more “important” pages should carry more weight.
    So when we count incoming links, we should also account for the “importance” of the pages those links come from, and in turn for the “importance” of the pages that link to those pages, and so on as we move from node to node in the web graph.
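
The idea that importance flows along links can be sketched with a short iterative calculation. This is a rough illustration, not the exact method any search engine uses: the four-page graph `toy_web` is made up, and the `damping` parameter (the chance of following a link rather than jumping to a random page) is an assumption added so the scores settle down.

```python
def page_rank(links, damping=0.85, iterations=100):
    """Approximate page rank by repeatedly passing each page's score along its links."""
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}   # start with equal importance
    for _ in range(iterations):
        # Every page receives a small baseline share from random jumps.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            # A page with no outgoing links spreads its score over all pages.
            targets = outgoing if outgoing else pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical web graph: page -> pages it links to.
toy_web = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}
ranks = page_rank(toy_web)
print(max(ranks, key=ranks.get))   # C
```

Page C comes out on top because three pages link to it. Page A comes second even though only one page links to it, because that one link comes from the important page C.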

    Question 7 (Lecture: Search Engines)

    Consider the example mentioned in class while explaining the intuition for Page Rank. There is a person at every node of the web graph. Each person chooses an outgoing link at random (with equal probability, independently of past and future decisions) and walks to another node. The process is repeated many times, then stopped to count how many people are at each node to find that node’s “importance” (rank). The higher the number of people at a node, the greater the rank of the page. (Page Rank is the expected number of “people” at a node in a web graph.)

    If at any point a person ends up at a page with no outgoing links (some pages may have no hyperlinks at all), they can “start over” by choosing a new node at random and going there.

    Using this example, which node in the following graph would have the highest page rank? Briefly explain why.
    In[]:=
    g=
    ;

    Sample Solution

    Good candidates for high page rank nodes are:
    ◦ americablog.org
    ◦ notgeniuses.com
    ◦ althippo.com

    However, americablog.org has incoming links from other hub-like pages such as notgeniuses.com and althippo.com, so americablog.org could have the highest page rank.

    notgeniuses.com has a link from americablog.org, so it may have a higher page rank than althippo.com.

    ajbenjaminjr.com might have a high page rank as well, on account of its incoming link from americablog.org.
    Note: While there will be no questions related to Wolfram Language code in the exam, I am including the following results to help support the above explanation.
    In[]:=
    ReverseSortBy[Transpose[{VertexList[g],PageRankCentrality[g,0.95]}],Last]
    Out[]=
    {{americablog.org,0.215219},{notgeniuses.com,0.168615},{ajbenjaminjr.com,0.140354},{althippo.com,0.13021},{pandagon.net,0.0935381},{alicublog.com,0.0753501},{thharmonic.com,0.0680814},{leanleft.com,0.0485536},{agonist.org,0.0381248},{monkeystyping.com,0.0219554}}

    Question 8 (Recommendation Engines)

    List some companies in the present day that care about building efficient and successful recommendation engines. Explain how they might use the recommendation engines.

    Sample solution
    

    Question 9 (Recommendation Engines)

    Explain how the following data may be represented in feature space:
  • Pokemon
  • Cities
  • Countries
  • Buildings on UIUC campus
  • Courses offered by ECE
  • Pixels in a digital image
  • Colors
  • Research publications
  • Political speeches

    Sample Solution

    Question 10 (Recommendation Engines)

    Find a sample from the dataset that is closest to the listed candidate:

    1-dimensional

    Which person is closest in age to Alice, who is 9 years old?

    2-dimensional

    Which city is closer to Champaign?

    3-dimensional
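
Finding the closest sample works the same way in one, two or three dimensions: measure the distance from the candidate to every sample and keep the smallest. The sketch below assumes straight-line (Euclidean) distance. The datasets for these questions are not reproduced here, so the names, ages, map coordinates and the helper `nearest` are all made up for illustration.

```python
import math


def nearest(candidate, samples):
    """Return the name of the sample whose features are closest to `candidate`."""
    return min(samples, key=lambda name: math.dist(samples[name], candidate))


# 1-dimensional: age in years (hypothetical people).
ages = {"Bob": (12,), "Carol": (8,), "Dave": (15,)}
print(nearest((9,), ages))           # Carol

# 2-dimensional: made-up (x, y) positions on a flat map, in miles.
cities = {"City X": (30, 40), "City Y": (-100, 20)}
print(nearest((0, 0), cities))       # City X

# 3-dimensional: colors as (red, green, blue) values.
colors = {"red": (255, 0, 0), "green": (0, 255, 0), "blue": (0, 0, 255)}
print(nearest((250, 10, 10), colors))  # red
```

Real city-to-city distances depend on the curvature of the Earth, so flat map coordinates are only an approximation.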

    What are the three different types of techniques used for recommendation engines?
    Explain, with the help of an example, the difference between supervised learning and unsupervised learning.
    Explain, with the help of an example, the difference between the two common types of supervised machine learning: classification and regression.
    How can unsupervised learning like clustering be used when you need a classifier model to predict labels for new data, but do not have labeled training examples?
    What type of machine learning would you use in each of the following cases? Explain why.
    Your choices are:
  • Supervised learning: Regression (How much or how many?)
  • Supervised learning: Classification (Is this A or B?)
  • Unsupervised learning: Clustering (How is the data organized? Are there groups and outliers?)

    1. Searching the App Store: You search for an app and you see various similar and recommended apps.
    2. Facebook: You post a picture and FB can automatically tag people in the picture.
    3. Gmail: You are typing your email ... and Gmail can automatically suggest the remainder of the sentence.
    4. State Farm: You are driving ... your smartphone is in the car ... SF modulates your insurance rate based on your driving score.
    5. Whole Foods: You go to the store ... sometimes only one of the eight checkout lanes (each with two stations) is open ... sometimes all the stations on all the lanes are open ... sometimes anything in between. The store manager uses information like how many customers walked in through the entrance, time of day, day of week, month/season, and events happening nearby to decide how many checkout stations to keep open.
    6. Whole Foods: You go to a store ... pick items into your cart ... and just walk out without any checkout lines ... you are charged to your credit card.
    7. Amazon Alexa: Input: hearing a voice command while a TV is on and other people are talking in the background ... Output: decode the voice command.
    8. SpaceX air-taxi: Input: data that shows where people get into taxis and where they get off ... Output: launch pad locations for drone-taxis.

    Sample solution

    1. Searching the App Store: Unsupervised - Clustering
    2. Facebook: Supervised - Classification
    3. Gmail: Supervised - Classification
    4. State Farm: Supervised - Regression
    5. Whole Foods: Supervised - Regression
    6. Whole Foods: Supervised - Classification (need to classify the items you bought using images, weight of your cart, your movement through aisles, etc.)
    7. Amazon Alexa: Supervised - Classification
    8. SpaceX air-taxi: Unsupervised - Clustering
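
The three choices can be told apart by what goes in and what comes out. The sketch below gives one tiny example of each. Everything in it is an assumption for illustration: the data values are made up, and the helper functions `fit_line`, `classify` and `two_means` are simple stand-ins, not the methods the companies above actually use.

```python
def fit_line(xs, ys):
    """Regression: least-squares line through labeled (x, y) examples."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


def classify(value, labeled):
    """Classification: copy the label of the closest labeled example."""
    return min(labeled, key=lambda item: abs(item[0] - value))[1]


def two_means(values, iterations=10):
    """Clustering: split unlabeled numbers into two groups (values must not all be equal)."""
    centers = [min(values), max(values)]
    for _ in range(iterations):
        groups = [[], []]
        for v in values:
            closer_to_first = abs(v - centers[0]) <= abs(v - centers[1])
            groups[0 if closer_to_first else 1].append(v)
        centers = [sum(g) / len(g) for g in groups]
    return groups


# Regression (how many?): customers in the store -> checkout stations to open.
slope, intercept = fit_line([10, 20, 30, 40], [2, 4, 6, 8])
stations_for_25 = slope * 25 + intercept   # about 5 stations

# Classification (A or B?): item weight in grams -> kind of fruit.
fruit = classify(170, [(150, "apple"), (160, "apple"), (1200, "melon"), (1300, "melon")])

# Clustering (what groups exist?): pickup positions along a street, no labels.
clusters = two_means([1, 2, 3, 10, 11, 12])

print(round(stations_for_25, 2), fruit, clusters)
```

Regression and classification both need labeled examples (the known station counts and fruit names), which is what makes them supervised. Clustering is given only the raw positions and finds the groups on its own.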

    Define the following:
  • Authentication
  • Encryption
  • Cryptography

    Explain in your own words how cryptography can be used as a tool for informatics, business, finance, politics, human rights, or any sector that deals with personal information or requires communication.

    Explain how each of the following works and also list an advantage or disadvantage:
  • Secret key cryptography
  • Public key cryptography
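
The difference between the two kinds of cryptography can be seen in a toy example. This is only a sketch for building intuition and must never be used for real security: the XOR cipher, the shared key, and the tiny primes 61 and 53 are all choices made here for illustration.

```python
# Secret key cryptography: the SAME key encrypts and decrypts,
# so both parties must somehow share it in advance.
def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy cipher: XOR each byte of the data with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


shared_key = b"shared-secret"
ciphertext = xor_cipher(b"meet at noon", shared_key)
recovered = xor_cipher(ciphertext, shared_key)   # applying it again undoes it

# Public key cryptography: toy RSA. Anyone may know (e, n) and use it to
# encrypt, but only the holder of the private exponent d can decrypt.
p, q = 61, 53                  # real keys use primes hundreds of digits long
n = p * q                      # 3233, part of the public key
phi = (p - 1) * (q - 1)
e = 17                         # public exponent
d = pow(e, -1, phi)            # private exponent (needs Python 3.8+)

message = 65
encrypted = pow(message, e, n)     # done by the sender with the public key
decrypted = pow(encrypted, d, n)   # done by the receiver with the private key

print(recovered, decrypted)
```

The trade-off shows up directly: secret key cryptography is simple and fast but requires a safe way to share the key, while public key cryptography avoids sharing a secret at the cost of heavier arithmetic.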

    List a few examples of digital technologies used in home security.
    How can computing be used to implement physical security?
    List some problems in using something like machine learning in real-life applications.
    List some advantages and disadvantages of using machine learning over humans.
    In class we discussed the following:
    Ethics: moral principles that govern ... the conducting of an activity.
    Privacy: the state or condition of being free from being observed or disturbed by other people.
    Describe a hypothetical scenario where data, and machine learning models trained on that data, are used unethically and compromise users’ privacy.
    Offer a solution to the problem.

    Sample Solution

    What is the idea of “net neutrality”?
    Provide examples where
  • You will trade your privacy for internet service benefits
  • You will not trade your privacy for internet service benefits

    Sample Solution

    Match the words with the definitions ...
    Pay attention to the information on the last two slides of most lecture presentations; those are good candidates for this set of questions.