Non-coders' guide to the web

These pages introduce some of the building blocks of the web, and should help you better "hack" web content and platforms to communicate more clearly online.

  1. How the Web Works: What everyone should know about algorithms, platforms, databases, search engines, and URLs to understand how information moves online.
  2. Creating Web Content: How to use basic web elements such as HTML and CSS, as well as everyday tools like your Gmail editor to create richer multimedia content.
  3. Remixing Online Content: How to seamlessly (and ethically) integrate images, text, and videos from others into your own online work. 

How the Web Works

Algorithms

According to Time Magazine, the average Facebook user has access to about 1,500 posts every day but only views 300. That means you may never see 80% of the content posted to your network, and Facebook decides what you will and will not see. Algorithms are sets of programmatic rules that help platforms like Facebook decide what information you'll see, and in what order. Algorithms are designed by humans, though, and reflect the assumptions and biases of their designers. Here are a few questions to help you consider how algorithms shape your access to information on sites like Google and Facebook:

How does my activity on this platform change the results?

Facebook tracks users' engagement with every post, and then reshapes your News Feed to show you content that you're most likely to interact with again. In other words, every time you like or comment on a post, or click through to view the content, you're telling Facebook: "show me more like this."
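The feedback loop described above can be sketched in a few lines of Python. This is a toy illustration only, not Facebook's actual algorithm, which is not public. The posts, the topic labels, and the one-point-per-interaction scoring are all invented for the example.

```python
from collections import Counter


def rank_feed(posts, interactions):
    """Order posts so that topics the user engaged with most come first.

    posts: list of (title, topic) pairs (hypothetical data).
    interactions: topics of posts the user liked, commented on, or clicked.
    """
    interest = Counter(interactions)  # one point per past interaction
    # sorted() is stable, so posts with equal scores keep their original order.
    return sorted(posts, key=lambda post: interest[post[1]], reverse=True)


posts = [
    ("City council vote", "politics"),
    ("Kitten learns to swim", "animals"),
    ("New ramen shop opens", "food"),
    ("Puppy meets snow", "animals"),
]

# The user engaged with two animal posts and one food post.
feed = rank_feed(posts, ["animals", "food", "animals"])
for title, topic in feed:
    print(f"{topic:10} {title}")
```

In this sketch the politics post sinks to the bottom simply because the user never interacted with that topic, which is the "show me more like this" effect in miniature.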

How does my current location change these results?

Google uses your location to determine which search results to show you. "For example, if you’re in Seattle, when you search for coffee shops, you'll see ones that are nearby" (via Google). This can be pretty handy when you're looking for coffee shops, but may cause problems or confusion when you're looking for scholarly information. Is an article from the UC Berkeley website necessarily more relevant to you than one from Harvard, for example?

Is the content you want to see the same as the content that the platform wants you to see?

One major stream of revenue for Facebook is from Pages. Owners of Pages usually have to sponsor posts to get them to show up in your News Feed, and unless you go out of your way to tell Facebook that you want to see posts from a specific Page in your feed, it's in Facebook's economic interest to force the Page to pay for placement. 

Does recent content get pushed to the top of your feed? Do more popular posts show up first? 

Most online platforms, including many news publications, privilege recent posts over older content. It's also increasingly common for news publications and others to use popularity (e.g., trending topics) as a metric for relevance. In scholarly research, however, neither recency nor popularity may work as helpful filters, and you might need to take extra steps (to search by a date range, for example) to counteract those defaults.
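To see how much the default sort order matters, here is a minimal Python sketch that orders the same three articles by recency, by popularity, and then filters them by a date range. The articles, dates, and share counts are made up for illustration, and real platforms combine many more signals than these.

```python
from datetime import date

# Hypothetical articles: (title, publication date, number of shares).
articles = [
    ("Foundational study", date(1998, 3, 1), 120),
    ("Viral explainer", date(2016, 11, 5), 9500),
    ("Latest commentary", date(2017, 6, 20), 40),
]


def newest_first(items):
    """Sort by publication date, most recent first."""
    return sorted(items, key=lambda item: item[1], reverse=True)


def most_shared_first(items):
    """Sort by share count, most shared first."""
    return sorted(items, key=lambda item: item[2], reverse=True)


def within_dates(items, start, end):
    """Keep only items published from start to end, inclusive."""
    return [item for item in items if start <= item[1] <= end]


print(newest_first(articles)[0][0])
print(most_shared_first(articles)[0][0])
print(within_dates(articles, date(1990, 1, 1), date(1999, 12, 31)))
```

Neither default sort puts the older foundational study first; only the explicit date range surfaces it, which is the kind of extra step the paragraph above recommends.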

Can I tune the results?

Facebook allows you to adjust your News Feed preferences by choosing which Friends and Pages you'd like to see first. Google also lets you view and control the activity saved to your Google account, including your search history. You can also turn off location tracking in Google.

Activity:

Choose an online platform or app that sorts a lot of information into a stream or feed of some kind—something like Instagram, Yelp, or Buzzfeed—and try to answer the following questions, looking at their homepage, your feed, or the search results on the site.

  1. How is the content sorted?
  2. Is the information presented in a way that's related to actions you've taken in the past? Or your search history?
  3. Do they privilege newer content over old content?
  4. What role does popularity play in how it's sorted?
  5. What information does the platform want you to see? Why?

Hacking URLs

We tend to take URLs for granted, but modifying URLs can help you share or save effective links to search results from library databases, search engines, and many other dynamically generated websites like Facebook.

Check out the URL diagram below for a Google Scholar search. Look at the end of the URL: the query string after the question mark, hl=en&q=text+mining, tells Google Scholar what to search for. In this case, q=text+mining supplies the keywords "text mining" and hl=en asks for the English-language interface.


[Figure: parts of a URL]

  • https:// is the protocol. Be wary of submitting any information on an http site (the "s" stands for "secure").
  • The host is scholar.google.com, which can be further broken down by:
    • google.com is the domain and scholar is the subdomain. Whoever controls a domain can create any subdomain under it, so beware of scam sites using subdomains like paypal.ripoff.com.
    • .com is the top-level domain here. Anyone can sign up for a .org or .com site, so those are essentially meaningless, but .edu sites are only available for accredited educational institutions and so may be more trustworthy for some kinds of research.
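The breakdown in the list above can be reproduced with Python's standard urllib.parse module. This is a minimal sketch: the "/scholar" path in the example URL is an assumption, since the text only quotes the host and the query string, and the simple dot-splitting of the host works for this example but not for every domain (hosts ending in .co.uk, for instance, need more care).

```python
from urllib.parse import parse_qs, urlsplit

# The "/scholar" path is an assumption; the guide only names the host
# (scholar.google.com) and the query string (hl=en&q=text+mining).
url = "https://scholar.google.com/scholar?hl=en&q=text+mining"

parts = urlsplit(url)
print(parts.scheme)    # the protocol
print(parts.hostname)  # the host
print(parts.query)     # everything after the question mark

# Naive split of the host into its dot-separated labels.
labels = parts.hostname.split(".")
top_level_domain = labels[-1]
domain = ".".join(labels[-2:])
subdomain = ".".join(labels[:-2])
print(subdomain, domain, top_level_domain)

# parse_qs turns the query string into a dictionary and decodes "+" as a space.
params = parse_qs(parts.query)
print(params)
```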

Activity:

  1. Can you change the search terms for the URL below to find information in the library about intersectionality and race instead of digital literacy?

    http://search.ebscohost.com/login.aspx?direct=true&bquery=(digital+AND+literacy)&type=0&site=eds-live
     

  2. URLs to commercial sites often include a lot of tracking information that helps companies to track your behavior online. Can you modify this Amazon URL to remove the tracking code and still find books by David Foster Wallace?

    https://www.amazon.com/Infinite-Jest-Novel-20th-Anniversary/dp/0316306053/ref=sr_1_3?ie=UTF8&qid=1498603075&sr=8-3&keywords=david+foster+wallace
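The same kind of URL editing can be done programmatically. The sketch below uses Python's standard urllib.parse module to swap the value of one query parameter and to drop unwanted ones. The helper names set_query_param and drop_query_params are invented for this example, the shop.example.com URL is hypothetical, and the search terms differ from the activity's so you can still work that one out by hand.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def set_query_param(url, name, value):
    """Replace the value of an existing query parameter, leaving the rest alone."""
    parts = urlsplit(url)
    pairs = [
        (key, value if key == name else old)
        for key, old in parse_qsl(parts.query, keep_blank_values=True)
    ]
    # safe="()" keeps parentheses readable instead of percent-encoding them.
    return urlunsplit(parts._replace(query=urlencode(pairs, safe="()")))


def drop_query_params(url, names):
    """Remove every query parameter whose name is in `names`."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    return urlunsplit(parts._replace(query=urlencode(kept, safe="()")))


library_url = (
    "http://search.ebscohost.com/login.aspx"
    "?direct=true&bquery=(digital+AND+literacy)&type=0&site=eds-live"
)
print(set_query_param(library_url, "bquery", "(text AND mining)"))

# Hypothetical shop URL with common "utm_" campaign-tracking parameters.
shop_url = "https://shop.example.com/item?id=42&utm_source=newsletter&utm_campaign=spring"
print(drop_query_params(shop_url, {"utm_source", "utm_campaign"}))
```

Which parameters a given site treats as tracking codes varies, so in practice it helps to remove one piece at a time and reload the page to check that it still works.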

Platforms

What's a Content Management System?

A content management system (CMS) is an application that helps you create and manage digital content. You don't need a CMS to create a website -- just power up your plain-text editor, create an .html file, and you will have a webpage! But CMSes often take on the heavy lifting of website design and functionality, allowing you to create more complex websites and focus on content. They are also good at managing permissions for multiple editors. CMSes typically have underlying databases that manage content and users. Often you will hear CMSes referred to as "web platforms." Examples include WordPress and Drupal.
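To show how little a CMS-free webpage needs, here is a small Python sketch that saves a minimal HTML document to a file. The page text and the write_page helper are invented for illustration; typing the same HTML into a plain-text editor and saving it with an .html extension gives the same result.

```python
from pathlib import Path

# A complete, if very plain, web page.
MINIMAL_PAGE = """<!DOCTYPE html>
<html>
  <head>
    <title>My first page</title>
  </head>
  <body>
    <h1>Hello, web!</h1>
    <p>This page was made without a CMS.</p>
  </body>
</html>
"""


def write_page(path):
    """Save the minimal page as a plain-text .html file and return its path."""
    target = Path(path)
    target.write_text(MINIMAL_PAGE, encoding="utf-8")
    return target
```

Calling write_page("index.html") and opening the resulting file in a browser should display the heading and the paragraph.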

What's web hosting?

A web hosting provider runs servers (aka big computers with special software) that process requests from users and deliver content (your website!) to other computers. Running your own server is a pain in the butt, so most people turn to dedicated web hosting providers to do it for them -- managing and updating the servers, dealing with security issues, backing up data, and making sure everything stays up and running. Usually, if you want a website, you need to find someone to host it.
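The request-and-response cycle a host handles for you can be sketched with Python's built-in http.server module. This is a toy server that runs on your own computer and answers a single request; it is not how a production host is set up, and the page content and the fetch_once helper are made up for the example.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<!DOCTYPE html><html><body><h1>Hello from my server</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        """Answer every GET request with the same small page."""
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet instead of logging each request


def fetch_once():
    """Start the server, request the page once, then shut the server down."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")       # the browser's side: ask for a page
        response = connection.getresponse()  # the server's side: deliver it
        body = response.read()
        connection.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


status, body = fetch_once()
print(status, body.decode("utf-8"))
```

A hosting provider does the same job at scale, for many sites and visitors at once, while also handling the updates, security, and backups mentioned above.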

Things to Know

  • CMSes are often designed with certain kinds of uses in mind. For example, WordPress was designed as a blogging platform. Wix was designed to create simple but attractive basic websites. Tumblr is most suited to share image-centered content.
  • The code of some platforms is open source (e.g., Drupal). This means the code base is available to anyone for reuse and extension. The code of other platforms is proprietary, which means you won't have direct access to it (although many do provide APIs). There are pros and cons to both open source and proprietary platforms, and understanding the implications can be important when choosing one.
  • Hosting isn't free. Whether with money, information, or eyeballs for ads, you have to pay for it. You'll need a web host to have content online; think carefully about the options, the company, the terms of use, and the amount of storage offered.
  • Have an exit strategy. There will come a point at which you'll want or have to change platforms or move your website. Companies go out of business, platforms are abandoned, designs age, or your website outgrows its home. Will it be easy to pack up and go when you have to?
  • CMSes and hosts are different things. You can create a website in WordPress and choose one of countless hosting providers to host it. However, they sometimes come packaged together, and sometimes they're only available together, especially with proprietary platforms. For example, you can't download the code base for Wix and install it on the host of your choice. If you want to use Wix, you also have to use them as your website host.

Search Engines vs. Databases

We tend to use terms like search engine and database interchangeably, but they are actually very different kinds of tools. Understanding a little bit about how they each work can help you search more effectively.

| | Search Engines | Databases |
| --- | --- | --- |
| Examples | Google, Bing | OskiCat, JSTOR, Netflix |
| What | An index of content that you can search to find links to content all over the web. | A discrete organized collection of resources. |
| Where | The content is often not in the search engine itself, but all over the web. | The content is sometimes available in the database itself (e.g., JSTOR). |
| Access | Search engines usually point to content owned by others. For this reason, Google will sometimes ignore content that is not freely available online, or point you to sites (like Amazon) where you can buy content. | Databases often own or license the content therein (e.g., Netflix), and therefore serve as a conduit to access premium content. |
| Search results | Usually ranked according to perceived "relevance" to other users. | More often organized according to various pre-defined principles (such as authors, titles, or subjects). |
| Ads | Often contain advertisements and/or sponsored search results. | Scholarly databases rarely include ads, but other databases very well might. |
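The "What" and "Search results" rows can be illustrated with a small Python sketch. The first half builds a tiny index that maps words to the pages containing them, which is roughly how a search engine finds links to content it does not hold itself. The second half queries a small collection of structured records by field, as a database does. All pages, records, and function names here are hypothetical, and the relevance ranking that real search engines apply is left out.

```python
from collections import defaultdict

# Hypothetical web pages: URL -> text found at that URL.
pages = {
    "https://example.org/a": "text mining for historians",
    "https://example.com/b": "mining history in the american west",
    "https://example.net/c": "text mining tutorial text analysis",
}


def build_index(pages):
    """Map each word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search_engine(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)


# Hypothetical database records with pre-defined fields.
records = [
    {"author": "Author B", "title": "Title Two", "subject": "text mining"},
    {"author": "Author A", "title": "Title One", "subject": "text mining"},
    {"author": "Author C", "title": "Title Three", "subject": "film theory"},
]


def database_search(records, field, value):
    """Return records whose field equals value, ordered by title."""
    hits = [record for record in records if record[field] == value]
    return sorted(hits, key=lambda record: record["title"])


print(search_engine(build_index(pages), "text mining"))
print(database_search(records, "subject", "text mining"))
```

The search engine returns links that point elsewhere, while the database returns the records themselves, organized by a field it already knows about.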

 

Activity:

  1. Go to JSTOR, a scholarly database you can access through the Library. 
  2. Try a keyword search on any research topic (e.g., feminist film theory or ethanol sustainability).
  3. Now try the exact same keyword search in Google.
  4. How do the search results differ?
  5. What kinds of sources (journals, newspapers, websites) come up for each?
  6. How does the level of writing compare? Who are they written for?
  7. What kinds of authors created the content? [Hint: Google an author's name to find out more info]