In this section we collect a variety of information related to security, culminating in a description of how to properly access or offer access to protected resources.
We start with a broad description of security principles. There are three key areas of concern when discussing security, commonly abbreviated as C-I-A (confidentiality, integrity, and availability):
Along with the aforementioned goals, we must also consider various attack methods. These fit broadly into four groups:
Group activity: Suppose that the "data" we want to protect is the contents of a specific piece of paper on the desk in your room. Consider the kinds of possible attacks on that data, based on the aforementioned four groups. Consider various approaches to "securing" this data against these attacks, and their tradeoffs/vulnerabilities regarding the above three concepts.
At the core of any security system is the ability to correctly identify and authenticate individuals attempting to access the system. This naturally breaks into two steps.
Group activity: Think of usages of your college ID around campus. Which of these usages only entail identification, and not authentication?
Group activity: Think of other practical examples of identification and authentication.
Group activity: Think of computer-related examples of identification and authentication.
One topic worthy of discussion is the difference between identity verification and authentication. In general, authentication is the stronger requirement. The difference can be seen when a bank gives someone access to their account: simply presenting an ID to verify your identity would not be sufficient to authenticate you as the account holder. You would likely also need to know the account number, or have a bank card or key.
Authentication methods naturally fall into categories, called factors. The standard factors are the following:
A common practice is so-called multi-factor authentication, in which we use more than one factor to authenticate. A special case is two-factor authentication. For instance, many online games now require you to download an "authenticator" to your phone, and when you want to log in to the game you need to both type your password (something you know) and type in the number shown on the authenticator (something you have).
Another example of this is using an ATM. You need to both have your debit card (something you have) and type in your PIN (something you know).
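To make the "authenticator" example concrete, here is a minimal sketch of how such rotating codes are commonly computed, following the standard HOTP/TOTP construction (RFC 4226 and RFC 6238). The function names, the 30-second window and the `two_factor_login` helper are our own choices for illustration, and the password check is reduced to a boolean that some other part of the system is assumed to supply.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional


def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """One-time code for a given counter value (HOTP, RFC 4226)."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # low 4 bits of the last byte pick a 4-byte window
    chunk = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(chunk % 10 ** digits).zfill(digits)


def totp(secret: bytes, now: Optional[float] = None, step: int = 30) -> str:
    """Time-based code: the counter is the number of `step`-second windows so far."""
    if now is None:
        now = time.time()
    return hotp(secret, int(now // step))


def two_factor_login(password_matches: bool, typed_code: str,
                     secret: bytes, now: Optional[float] = None) -> bool:
    """Both factors must succeed: something you know AND something you have."""
    code_matches = hmac.compare_digest(typed_code, totp(secret, now))
    return password_matches and code_matches
```

The phone and the server share `secret` in advance, so both can compute the same code for the current time window without any further communication.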
Group activity: Describe the authentication systems, if any, in place when using a credit card for a purchase, both physically at a store as well as online.
Group activity: Think of some single-factor systems and describe how we may turn them into multi-factor systems.
After identification and authentication have been performed, the next step is authorization, namely providing access to only those assets that the authenticated party is supposed to be able to access. In other words, authorization determines what the authenticated party can and cannot do. These controls can be physical (a guard, a gate, etc.) or logical (an electronic lock).
A key principle when considering authorization is the following:
Principle of least privilege
We should only allow the bare minimum of access to an authorized party, in order for it to perform the needed functionality.
There are many examples of this principle. For instance the staff working in the registrar's office should not have direct access to your business office information, and conversely someone working in the business office should not be able to sign you up for classes.
The system that we have in place right now in the college does not follow this principle. It would be nice if you could give your parents limited access to the "billing" part of your account, so they could monitor it and pay and so on, but the only way to really do that right now is to give them your password, which also gives them access to your email, classes, grades etc.
Another example is a web server. The process that is running the web server should be given access only to the files it needs to do its job, and not the rest of the system. Oftentimes however the process acting as the web server is a superuser of the system. This could allow an intruder to exploit a web vulnerability and execute instructions as a superuser/administrator.
Most systems implement a separate user account, often called "apache" in the case of the Apache web server, and all webpages get that user's more limited permissions.
Many problems on personal computers are caused by the (default) user account also being the administrator account. This means that any application or program downloaded from the internet will be executed with full administrator privileges on the computer, and would thus be capable of doing a lot of damage. Therefore it is a common guideline to always create separate "user" and "administrator" accounts.
Authorization is described most often in terms of access control. Every use case can be described in terms of the following basic operations:
There are fundamentally two ways in which one can provide access control.
Access control lists (ACLs) are very common in many systems, from filesystems to web services. For any resource, they list the identifiers of the parties that are allowed to access the resource, and what tasks they are allowed to perform. For instance you may protect a file against writing, and only allow it to be read. Likewise we would prevent users from being able to access each other's files.
Another example of ACLs is the firewall that we use at the college, which blocks access to many resources from off campus, while allowing emails or web requests through. In this instance the "identifier" is the specific IP address of the incoming request and the "port" on which it arrives. Email and web requests use specific ports, so the firewall blocks all other ports.
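An access control list can be pictured as a lookup table from resources to parties to permitted operations. The following is only a minimal sketch: the file paths, user names and the `is_allowed` helper are made up for illustration, and real filesystems or firewalls store and evaluate their rules quite differently.

```python
from typing import Dict, Set

# resource -> {party identifier -> operations that party may perform}
ACL: Dict[str, Dict[str, Set[str]]] = {
    "/home/alice/notes.txt": {"alice": {"read", "write"}, "bob": {"read"}},
    "/home/bob/todo.txt": {"bob": {"read", "write"}},
}


def is_allowed(party: str, resource: str, operation: str) -> bool:
    """Default deny: anything not explicitly listed is refused."""
    return operation in ACL.get(resource, {}).get(party, set())


print(is_allowed("bob", "/home/alice/notes.txt", "read"))   # listed, so permitted
print(is_allowed("bob", "/home/alice/notes.txt", "write"))  # not listed, so refused
```

Note that the decision depends on who is asking: the check always starts from the party's identifier.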
The second approach is less common. Its goal is to provide access based on a token rather than on an identification of the individual. The holder of the token is given access, and thus for example a user can obtain a token for a specific action, then give that token to someone else. That other party can then use the token to obtain access to the resource. A good example of this is our college card system. If you give someone your card, then they can use it to, for instance, open a door, make photocopies, etc. But their access is mostly limited to those actions that the card enables. They still can't check your email for example, as they do not have your password.
Another example of this would be a prescription for a medicine. Anyone can pick up the medicine as long as they have the filled out prescription (and possibly an ID).
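In contrast with a list keyed by identity, a token-based check never asks who is presenting the token. Here is a minimal sketch under that assumption; the `issue_token` and `use_token` names and the in-memory `_grants` table are hypothetical and only meant to illustrate the idea.

```python
import secrets
from typing import Dict, Tuple

# token -> (resource, operation) it unlocks; no user identity is stored anywhere
_grants: Dict[str, Tuple[str, str]] = {}


def issue_token(resource: str, operation: str) -> str:
    """Create an unguessable token that grants exactly one kind of access."""
    token = secrets.token_urlsafe(16)
    _grants[token] = (resource, operation)
    return token


def use_token(token: str, resource: str, operation: str) -> bool:
    """Whoever holds the token gets the access it names, and nothing more."""
    return _grants.get(token) == (resource, operation)


door_token = issue_token("library-door", "open")
# The token can be handed to a friend; the check below does not care who calls it.
print(use_token(door_token, "library-door", "open"))
print(use_token(door_token, "email", "read"))
```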
Group activity: Think of other authorization situations from your experience, and describe them according to the aforementioned groups.
When it comes to electronic security, cryptography has given us a powerful set of tools. We will very briefly discuss some of these tools in this section. The full topic would go far beyond the scope of the course.
Let us clarify some terms used in a cryptographic system:
A fundamental tenet of modern cryptographic systems is that everything else about the process is known to an attacker, except for the message and the secret key. The attacker gets to see the ciphertext and also has full knowledge of how the message and the secret key were processed to obtain that ciphertext. The attacker then wants to learn the message or the key, and if the system is secure then they should not be able to do so. The converse approach, trying to secure something by hiding the details, is called "security through obscurity", and is considered a bad and brittle approach.
The remarkable fact is that mathematics provides us with the means of producing these secure ciphers. Even though an attacker knows perfectly well what the ciphertext is and how it was obtained, they have no practical way to determine the message and key that gave rise to the ciphertext.
There are various kinds of ciphers. A first fundamental example is that of symmetric key ciphers, also known as private key ciphers. In this situation, there is a common key between the sender and the receiver, somehow agreed upon in advance. This same key is then used for both encryption and decryption.
These ciphers come in two flavors. Stream ciphers operate on one bit (or byte) of the message at a time. Block ciphers, on the other hand, operate on fixed-size blocks of bits as a group, typically 64 bits (as in DES) or 128 bits (as in AES). Most symmetric key ciphers in use nowadays are block ciphers.
Popular examples of symmetric key ciphers are DES, AES, RC5 and Blowfish.
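The defining feature of a symmetric cipher, that one shared key both encrypts and decrypts, can be seen in a toy example. The sketch below is not DES or AES (Python's standard library does not include them); it is a one-time pad, which XORs the message with a random key of the same length. The message and variable names are made up for illustration.

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the matching key byte (key must be as long as data)."""
    if len(key) != len(data):
        raise ValueError("a one-time pad key must be exactly as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet me at noon"
shared_key = secrets.token_bytes(len(message))  # agreed upon in advance by both parties

ciphertext = xor_bytes(message, shared_key)     # sender encrypts
recovered = xor_bytes(ciphertext, shared_key)   # receiver decrypts with the SAME key
```

Because XORing twice with the same key cancels out, the same function serves for both directions. The key must never be reused for a second message.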
The asymmetric key ciphers differ from symmetric key ciphers in that they use two keys. The sender is given a public key which they use to encrypt the message, while the receiver has a private key that they use to decrypt the ciphertext.
The advantage of these ciphers is that the parties do not need to have had any prior "conversation" (in the symmetric key case the two parties must already share a secret key). The sender simply asks the receiver for their public key, and uses it to send the message.
Asymmetric ciphers are often used during the "handshake" portion of a client-server interaction, to establish a common secret key that the two parties can use for further information exchange via a symmetric cipher. For instance the first time your computer (the client) tries to connect to your bank's web server, they might initiate such an exchange. It might go something like this:
The reason you may want to do that is that asymmetric ciphers are a lot slower than symmetric ciphers.
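One possible shape of such an exchange can be sketched with toy numbers. Everything here is an assumption made for illustration: the tiny RSA primes, the way the session secret is stretched into a pad with SHA-256, and the XOR "cipher" standing in for a real symmetric cipher. Real protocols such as TLS are considerably more involved.

```python
import hashlib
import secrets

# Server side: a toy RSA key pair. (N, E) is public, D stays private.
P, Q = 61, 53
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))  # modular inverse; needs Python 3.8+


def keystream_xor(data: bytes, session_secret: int) -> bytes:
    """Stand-in symmetric cipher: XOR with a pad derived from the shared secret."""
    pad = hashlib.sha256(session_secret.to_bytes(2, "big")).digest()
    if len(data) > len(pad):
        raise ValueError("this toy pad only covers messages up to 32 bytes")
    return bytes(b ^ p for b, p in zip(data, pad))


# 1. The client obtains (N, E) and picks a random session secret in [2, N - 1].
client_secret = secrets.randbelow(N - 2) + 2
# 2. The client sends the secret encrypted under the server's public key.
wrapped = pow(client_secret, E, N)
# 3. Only the server, holding D, can unwrap it.
server_secret = pow(wrapped, D, N)
# 4. From here on both sides use the fast symmetric cipher with the shared secret.
request = keystream_xor(b"balance please", client_secret)
received = keystream_xor(request, server_secret)
```

The slow asymmetric step happens once, on a very short value; all later traffic uses the symmetric step.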
Asymmetric ciphers often rely on deep mathematical conjectures. For instance the RSA cipher relies on our belief that for two very large prime numbers, if we are only given their product then we cannot feasibly recover the numbers.
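To see why the primes must be very large, consider what happens when they are small. In the sketch below (with toy values chosen for illustration) an attacker who knows only the public key factors the modulus by trial division and then rebuilds the private key. The same attack is hopeless in practice when the primes have hundreds of digits.

```python
from typing import Tuple


def factor_semiprime(n: int) -> Tuple[int, int]:
    """Find the two prime factors of n by trial division (only viable for tiny n)."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f, n // f
        f += 1
    raise ValueError("n has no nontrivial factors")


# Public information: the modulus and the encryption exponent.
n, e = 3233, 17
ciphertext = pow(65, e, n)  # someone encrypts the message 65 with the public key

# The attacker factors n, which immediately reveals the private exponent.
p, q = factor_semiprime(n)
d = pow(e, -1, (p - 1) * (q - 1))  # modular inverse; needs Python 3.8+
recovered = pow(ciphertext, d, n)
```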
Some popular asymmetric systems are RSA, which has long been used in the SSL/TLS protocol that secures web exchanges (e.g. HTTPS), ElGamal, and the Diffie-Hellman key exchange.
This system relies on a level of trust when the server sends its public key to the rest of the world. Certificates are used to keep this kind of information somewhat protected. But this is a more complicated topic.
It's worth pointing out that in the past many of these algorithms were considered state secrets. For instance, a popular encryption program called PGP was, on its release, treated as a munition under US export rules, and its creator, who released it to the world, spent several years under investigation for possible arms export violations.
Cryptographic Hash functions are another tool in our arsenal. They are key-less, in the sense that they do not employ a secret key, and are one-way, in the sense we will describe in a moment.
A cryptographic hash function (briefly a hash function) turns a message of any length into a short, fixed-length message digest or hash. The resulting hash has a number of useful properties:
NOTE: Hash functions are also used to construct hash tables and dictionaries in many programming languages. These functions do NOT have all the good properties described above.
Some popular hash functions are MD5, SHA-2 and SHA-3.
MD5 in particular has been used extensively to validate file downloads, though it is gradually being supplanted by SHA, since MD5 is no longer considered collision resistant. A company that has released a file/executable may provide you with multiple mirrors/locations from which you can download it. To guard against the possibility that someone has changed that executable in a way that could corrupt your system, they also publish the hash of the executable. You can then compute the hash of the downloaded file before executing it, to make sure it matches what the company gave you.
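Checking a download against a published hash takes only a few lines with Python's `hashlib`. This is a minimal sketch: the function names are our own, and we default to SHA-256 rather than MD5, although any algorithm name that `hashlib.new` accepts could be passed instead.

```python
import hashlib


def file_digest(path: str, algorithm: str = "sha256", chunk_size: int = 65536) -> str:
    """Hash a file in chunks, so that large downloads need not fit in memory."""
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches_published(path: str, published_hex: str, algorithm: str = "sha256") -> bool:
    """Compare our own digest with the one the publisher lists on their site."""
    return file_digest(path, algorithm) == published_hex.strip().lower()
```

This check is only as trustworthy as the channel through which the published hash was obtained: if an attacker can alter both the file and the listed hash, the comparison proves nothing.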
Hash functions along with asymmetric key encryptions can be used to "digitally sign" a document. The server is here the one who wants to send a signed document to the client. The steps are as follows:
Digital signatures thus assure the integrity of the message, and also that it was signed by the holder of the private key. If confidentiality is also needed, then we can add a symmetric encryption phase to ensure that as well, though it is important to get the details right and we will not discuss this further here.
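The hash-then-sign idea can be sketched with a toy RSA key. The key below is built from two well-known Mersenne primes purely so that the example is short; that choice, the exponent 65537 and the function names are assumptions for illustration, and real signature schemes also apply a padding step that is omitted here.

```python
import hashlib

# Toy RSA key pair held by the server. (N, E) is public, D is private.
P, Q = 2**61 - 1, 2**89 - 1           # both prime, but far too structured for real use
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))     # modular inverse; needs Python 3.8+


def digest_as_int(document: bytes) -> int:
    """Hash the document and reduce the digest into the range the toy key can handle."""
    return int.from_bytes(hashlib.sha256(document).digest(), "big") % N


def sign(document: bytes) -> int:
    """Server: transform the hash with the PRIVATE key."""
    return pow(digest_as_int(document), D, N)


def verify(document: bytes, signature: int) -> bool:
    """Client: undo the transformation with the PUBLIC key and compare hashes."""
    return pow(signature, E, N) == digest_as_int(document)


contract = b"I agree to pay 10 dollars."
signature = sign(contract)
```

Only the hash is signed, so the expensive asymmetric operation is applied to a short value regardless of how long the document is.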
Digital certificates can be used to link a public key to a particular individual or company. Therefore when we need to obtain someone's key, we obtain their certificate instead of directly asking them. This helps prevent so-called man-in-the-middle attacks.
Digital certificates are issued by a Certificate Authority (CA), which is a trusted party that digitally signs the certificates using its private key. This way we only need to trust the certificate authority, rather than trust each individual or company to give us their true public keys. VeriSign is one such widely used CA.
There is a larger infrastructure that makes working with public keys effective. It is called a Public Key Infrastructure (PKI).
In this section we will discuss how to implement web-server authentication. The most standard system in place is a username-password system, with session information preserved. Typically this consists of the following:
We will discuss how such a system can be implemented, and various gotchas along the way. Here are some things to watch out for:
One first key tool is the use of HTTPS instead of HTTP. This adds a number of steps in the process:
While in some instances HTTPS is used only for the login page, it is preferable to instead use it throughout the site to better protect the session information.
Another key component is how passwords are being managed.
As most people's passwords tend to have small "entropy" (i.e. the number of different possibilities is not all that big), they are often subject to what are known as dictionary attacks or rainbow table attacks. To combat this, we often use what is known as a salt.
The salt is a randomly generated sequence of fixed length (say 20 bytes) that we append to the password before hashing. We generate the salt when the user first creates their account, and we store it in the database alongside the username and hashed password. When the user tries to log in later, we recover the salt from the database, combine it with the password the user provided, compute the hash and compare it to the stored hash.
This adds a bit of randomness to each account. Even if two users have the exact same password, they will have different salts and therefore will produce different hashes.
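The account-creation and login steps just described might be sketched as follows. We assume a plain dictionary stands in for the database, and we swap the single hash of "password plus salt" for PBKDF2 (via `hashlib.pbkdf2_hmac`), a deliberately slow salted construction from the standard library. The names and the iteration count are illustrative choices, not recommendations.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Tuple

SALT_BYTES = 20
ITERATIONS = 100_000  # illustrative; higher values slow down guessing further

Database = Dict[str, Tuple[bytes, bytes]]  # username -> (salt, password hash)


def hash_password(password: str, salt: bytes) -> bytes:
    """Salted, iterated hash of the password (PBKDF2 with HMAC-SHA-256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)


def create_account(db: Database, username: str, password: str) -> None:
    salt = secrets.token_bytes(SALT_BYTES)  # fresh random salt for every account
    db[username] = (salt, hash_password(password, salt))


def check_login(db: Database, username: str, password: str) -> bool:
    record = db.get(username)
    if record is None:
        return False
    salt, stored_hash = record
    # Recompute with the stored salt and compare in constant time.
    return hmac.compare_digest(stored_hash, hash_password(password, salt))
```

The password itself is never stored, only the salt and the resulting hash.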
Sessions aim to maintain user information across multiple HTTP(S) requests. This is what allows us to not have to log in on every single page of a site. In order to achieve that, sessions are created.
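A server-side session store can be sketched in a few lines. This is only an illustration under several assumptions: sessions live in an in-memory dictionary, they expire after 30 minutes, and the session ID would travel back and forth with each request (typically in a cookie, which is not shown). All names are hypothetical.

```python
import secrets
import time
from typing import Dict, Optional

SESSION_LIFETIME = 30 * 60  # seconds; an arbitrary choice for this sketch
_sessions: Dict[str, dict] = {}


def start_session(username: str, now: Optional[float] = None) -> str:
    """Called once after a successful login; returns the ID handed to the browser."""
    now = time.time() if now is None else now
    session_id = secrets.token_urlsafe(32)  # long and random, so it cannot be guessed
    _sessions[session_id] = {"user": username, "expires": now + SESSION_LIFETIME}
    return session_id


def current_user(session_id: str, now: Optional[float] = None) -> Optional[str]:
    """Called on every later request: maps the presented ID back to a user, if valid."""
    now = time.time() if now is None else now
    session = _sessions.get(session_id)
    if session is None:
        return None
    if session["expires"] <= now:
        del _sessions[session_id]  # expired sessions are discarded
        return None
    return session["user"]


def end_session(session_id: str) -> None:
    """Called on logout."""
    _sessions.pop(session_id, None)
```

Anyone who learns a valid session ID can act as that user until it expires, which is one reason to serve the whole site over HTTPS.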