Compactly - a URL shortening service
July 27, 2021 - Check out Compactly at compactly.io.
Overview
Background
URL shortening services such as TinyURL and Bitly are a common case study in system design discussions and technical interviews. While the question of how best to scale such applications is open-ended, their core functionality is relatively simple: given a URL, generate a shorter URL that redirects to the original website.
The simplicity of this application's fundamental requirements makes it a great project for reaching a number of learning outcomes:
- Gaining experience in developing and deploying a service from scratch.
- Exploring avenues for scaling an application in a practical manner.
Usage
Once registered, a user must verify their account via an email confirmation link. After that, they can head to the App section and start creating and managing their short links. For now, there is a hard limit of 1000 short links per user.
Users can share their short links, which redirect to the original URL when entered into a browser. For example, https://www.compactly.io/NeAJzY5 redirects to this blog post!
Technical Discussion
Encoding the URLs
A decision central to any URL shortening service is how the short link is generated. A common approach is to hash the original URL using a hashing algorithm, such as SHA256 or MD5, and then encode the result for display. A second, equally critical decision is how many characters the encoded hash should have. Both answers come down to a compromise between brevity and the number of possible unique encodings.
For example, let's say we use Base62 encoding. If we decide that our encoded hashes have length n, we can generate a maximum of 62^n encodings. For small values of n we achieve brevity, but we will almost certainly run into collisions as the number of encoded URLs grows. On the other hand, if we make n large, say 12, the collision probability becomes negligible. However, this would make the supposedly shorter URLs too long, defeating the purpose of our application! Therefore a compromise must be reached. Setting n=7 is, I believe, a good compromise: it allows 62^7 (roughly 3.5 trillion) possible encodings, which gives us brevity while keeping the collision probability very low.
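To make the trade-off concrete, here is a rough sketch that computes the size of the encoding space for a given n and estimates the chance of at least one collision using the standard birthday-bound approximation. It assumes the encodings are uniformly distributed, and the figure of 100,000 URLs is only an illustrative assumption, not a number from Compactly.

```python
import math

ALPHABET_SIZE = 62  # digits + lowercase letters + uppercase letters


def capacity(n: int) -> int:
    """Number of distinct Base62 strings of length n."""
    return ALPHABET_SIZE ** n


def collision_probability(num_urls: int, n: int) -> float:
    """Approximate probability of at least one collision among num_urls
    uniformly distributed encodings of length n (birthday bound)."""
    space = capacity(n)
    # 1 - exp(-k(k-1) / 2N); expm1 keeps precision when the value is tiny.
    return -math.expm1(-num_urls * (num_urls - 1) / (2 * space))


# Illustrative workload (an assumption): 100,000 stored short links.
for n in (4, 7, 12):
    print(n, capacity(n), collision_probability(100_000, n))
```

The estimate grows quickly with the number of stored links, so any hash-based scheme still needs a way to detect a collision when it does occur.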
There are indeed ways to guarantee that no collision ever happens. One is to have a host maintain a counter over a very large range of numbers, incrementing it each time a new URL encoding has to be generated; the short link is then the encoding of that number. Even better, we could use a distributed coordination service, such as ZooKeeper, to allocate ranges to a number of hosts. While these approaches have their advantages, they would increase the complexity of the project significantly. For that reason, Compactly follows the hashing approach for the time being.
Another question we have to answer is how to handle two different users encoding the same URL. The easy answer would be to keep a single entry in our database and not care where the redirects come from, e.g., whether person A or person B shared the short link. This is rather limiting, however: it prevents us from tracking metrics such as a 'hit count' for each user's short link, and it limits the control we can give users over the short links they create, such as the ability to delete them. Instead, Compactly combines the original URL with the unique identifier of the user and then hashes the combination, resulting in unique short links even if a number of users wish to encode the same link.
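For comparison, the counter-based alternative could look roughly like the sketch below. This is not what Compactly does; the names `base62_encode` and `CounterShortener` are hypothetical, and the `start` argument stands in for the beginning of a range that a coordinator such as ZooKeeper might hand to a host.

```python
import itertools
import string

# Alphabet order is an assumption; any fixed ordering of 62 symbols works.
BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase


def base62_encode(number: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if number < 0:
        raise ValueError("number must be non-negative")
    if number == 0:
        return BASE62[0]
    digits = []
    while number:
        number, remainder = divmod(number, 62)
        digits.append(BASE62[remainder])
    return "".join(reversed(digits))


class CounterShortener:
    """Hands out a new, never-repeating code on every call."""

    def __init__(self, start: int = 0):
        # 'start' models the first value of the range given to this host.
        self._counter = itertools.count(start)

    def next_code(self) -> str:
        return base62_encode(next(self._counter))
```

Because every code is derived from a distinct integer, two calls can never yield the same short link, at the cost of having to keep the counter state somewhere durable.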
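A minimal sketch of this idea follows. The post does not spell out which hash function is used, how the URL and user id are joined, or how the hash is cut down to 7 characters, so the choice of SHA256, the `:` separator and taking the 7 lowest Base62 digits are all assumptions made for illustration.

```python
import hashlib
import string

BASE62 = string.digits + string.ascii_lowercase + string.ascii_uppercase
CODE_LENGTH = 7  # n = 7, as discussed above


def short_code(original_url: str, user_id: str) -> str:
    """Derive a 7-character Base62 code from a (user, URL) pair."""
    # Assumption: user id and URL are joined with ':' before hashing.
    payload = f"{user_id}:{original_url}".encode("utf-8")
    digest = hashlib.sha256(payload).digest()
    number = int.from_bytes(digest, "big")
    # Assumption: keep only the 7 lowest-order Base62 digits of the hash.
    chars = []
    for _ in range(CODE_LENGTH):
        number, remainder = divmod(number, 62)
        chars.append(BASE62[remainder])
    return "".join(chars)


# Same URL, different (hypothetical) users: different short codes.
print(short_code("https://example.com/article", "user-a"))
print(short_code("https://example.com/article", "user-b"))
```

The function is deterministic, so the same user shortening the same URL twice gets the same code, which is one simple way to avoid duplicate rows per user.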
Database Schema
The database schema for Compactly is rather simple. In addition to the tables generated for us by Identity, Compactly has a single table, Urls, which records the URLs and their encodings with the following columns:
- The original URL.
- Its encoding.
- Date created.
- A field for recording the number of redirects.
- The unique id of the user that the short link belongs to.
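As a rough illustration of the columns listed above, the sketch below creates an equivalent table in an in-memory SQLite database. Compactly itself uses SQL Server through Entity Framework, so SQLite, the column names, the column types and the uniqueness constraint on the encoding are all assumptions rather than the real schema.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical column names mirroring the list above.
SCHEMA = """
CREATE TABLE Urls (
    OriginalUrl   TEXT    NOT NULL,
    Encoding      TEXT    NOT NULL UNIQUE,   -- assumption: one row per code
    DateCreated   TEXT    NOT NULL,
    RedirectCount INTEGER NOT NULL DEFAULT 0,
    UserId        TEXT    NOT NULL
);
"""


def create_database() -> sqlite3.Connection:
    """Create an in-memory database containing the Urls table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    return conn


def add_url(conn: sqlite3.Connection, original_url: str,
            encoding: str, user_id: str) -> None:
    """Insert a new short link; the redirect count starts at zero."""
    conn.execute(
        "INSERT INTO Urls (OriginalUrl, Encoding, DateCreated, UserId) "
        "VALUES (?, ?, ?, ?)",
        (original_url, encoding,
         datetime.now(timezone.utc).isoformat(), user_id),
    )
```

With the uniqueness constraint in place, an insert that reuses an existing encoding fails, which gives the application a natural point at which to notice a hash collision.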
Redirection
Upon entering a short link in a browser, the request will reach the Compactly servers, which will look through the database for a matching entry in the Urls table described in the previous section.
If found, Compactly will then issue an HTTP redirect, forwarding the request to the website indicated by the value in the original url column.
If not found, Compactly will simply redirect the user to its home page.
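The lookup-and-redirect flow can be sketched as follows, using a plain dictionary in place of the Urls table. The 302 status code, the home page URL and the step that increments the redirect count are assumptions; the post only says that an HTTP redirect is issued and that a redirect counter is stored.

```python
from dataclasses import dataclass

HOME_PAGE = "https://www.compactly.io/"  # assumed home page address


@dataclass
class UrlEntry:
    original_url: str
    redirect_count: int = 0


def resolve(code: str, urls: dict[str, UrlEntry]) -> tuple[int, str]:
    """Return (HTTP status, Location header value) for a short code."""
    entry = urls.get(code)
    if entry is None:
        # No matching entry: send the visitor to the home page.
        return 302, HOME_PAGE
    # Assumption: the hit counter is bumped on every successful redirect.
    entry.redirect_count += 1
    return 302, entry.original_url


# Hypothetical stored entry, for illustration only.
urls = {"abc1234": UrlEntry("https://example.com/article")}
print(resolve("abc1234", urls))
print(resolve("missing", urls))
```

A temporary redirect is used here because a permanent one may be cached by browsers, which would hide later visits from the counter; which status Compactly actually returns is not stated.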
Technologies
I picked .NET 5 for developing this application for a number of reasons.
- I use .NET technologies in my day-to-day, and anything I gain from my personal projects will carry over.
- I wanted to gain experience in using Entity Framework, an object-relational mapper which enables developers to work with a database using .NET objects.
- I wanted to try out Identity for .NET, an API that provides a UI for login functionality and manages users, passwords, roles and much more, right out of the box.
Regarding the database, I picked SQL Server as that arguably offers the most seamless developer experience when developing with .NET technologies. For similar reasons I also decided to use Azure as my web hosting service.
Closing Thoughts
This project was fun to develop and it was successful in achieving the learning outcomes I set out to achieve. With that being said, there are a lot of improvements that can be made, both in terms of usability and functionality for the users and in terms of improving scalability.
This project will be treated as a work in progress and will be expanded gradually, depending on requirements for new functionality, the future user base and feedback.
Try Compactly out now at compactly.io and reach me at info@alexandros.io for feedback!