How to create a YouTube Sports Highlight Platform

In this article, we will explore how to set up your own YouTube-like sports website or application.

At some point, many of us have imagined building our very own video sharing platform. But turning that dream into reality is not as simple as it sounds. It requires technical knowledge, a solid infrastructure, and most importantly, access to video content.

One way to overcome the content problem is to integrate it with an external API that supplies sports highlight videos. In this article, we will guide you through the process of building a sports highlight platform that functions similarly to YouTube, but focused on sports content.

While we will discuss a setup designed for scalability, that does not mean you cannot get started with a smaller budget. All you really need is a domain, a server, a database, and some creativity to launch your own version of the sports video platform.

Architectural setup

If we take a closer look at the image from the article, we will notice several AWS components. Let us break down what each of them does.

Amazon Route 53

Amazon Route 53 is a scalable Domain Name System (DNS) service from AWS. Its primary job is to translate domain names like www.youtube.com into IP addresses so browsers can find and connect to the right servers. When a user enters your domain, Route 53 checks its records and routes the request to the appropriate resource regardless of whether it is an EC2 instance, load balancer, or even an external server.

The name Route 53 comes from port 53, the standard port used for DNS queries. But the service does much more than basic DNS resolution. It offers domain registration, health checks, and intelligent traffic routing based on latency, geography, or custom weights. This helps improve performance and availability for users across the globe.

Elastic Load Balancer

Elastic Load Balancing (ELB) is the AWS service that distributes incoming network traffic across multiple targets, such as EC2 instances. It helps ensure that no single instance becomes overwhelmed, improves application fault tolerance, and enhances overall performance.

When users make requests to your application, whether it's a website, API, or mobile backend, the load balancer acts as the entry point. It receives all incoming traffic and automatically spreads it across a pool of healthy EC2 instances. If one instance becomes unhealthy or crashes, the load balancer routes traffic to the remaining healthy ones without user disruption.
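The behavior described above can be illustrated with a minimal sketch. It assumes a simple round-robin rotation and a manually updated health flag; a real ELB runs its own health checks and offers other routing algorithms, and the class and method names here are purely illustrative.

```python
from itertools import cycle
from typing import Dict, Iterator, List


class RoundRobinBalancer:
    """Toy load balancer: rotates over instances and skips unhealthy ones."""

    def __init__(self, instances: List[str]) -> None:
        self.instances = instances
        self.healthy: Dict[str, bool] = {name: True for name in instances}
        self._rotation: Iterator[str] = cycle(instances)

    def mark(self, instance: str, healthy: bool) -> None:
        # In AWS this flag would be driven by periodic health checks.
        self.healthy[instance] = healthy

    def next_instance(self) -> str:
        # Try each instance at most once per call so we never loop forever.
        for _ in range(len(self.instances)):
            candidate = next(self._rotation)
            if self.healthy[candidate]:
                return candidate
        raise RuntimeError("no healthy instances available")


balancer = RoundRobinBalancer(["api-1", "api-2", "api-3"])
balancer.mark("api-2", False)  # simulate a crashed instance
targets = [balancer.next_instance() for _ in range(4)]
```

With "api-2" marked unhealthy, requests alternate between the two remaining instances, which is the failover behavior users never notice.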

AWS offers three main types of load balancers under the ELB family:

Application Load Balancer (ALB): Best for HTTP/HTTPS traffic with advanced routing, like path-based or host-based rules.
Network Load Balancer (NLB): Handles TCP/UDP traffic at high performance and low latency, ideal for real-time systems.
Gateway Load Balancer: Used for deploying third-party virtual appliances like firewalls or packet inspection tools.

Each load balancer integrates with AWS services such as Auto Scaling, Route 53, and CloudWatch. This enables automatic scaling, intelligent routing, and real-time monitoring.

EC2 instances

An EC2 instance (Elastic Compute Cloud) is a virtual server in Amazon's cloud. It provides resizable compute capacity, allowing you to run applications just like you would on a physical machine, but without managing the underlying hardware. You can choose the amount of CPU, memory, storage, and networking power based on your workload.

EC2 instances are used to host web apps, APIs, databases, background jobs, game servers, and more. They support multiple operating systems, including various Linux distributions and Windows, and you can install any software you need.

To handle changes in demand, EC2 instances can be scaled in two main ways:

Vertical scaling: Resizing the instance type. For example, you can move from a t3.medium to a t3.xlarge to get more vCPUs and RAM. This is a quick fix but has limits, since each instance family has a largest size.
Horizontal scaling: Adding or removing EC2 instances. This is more flexible and resilient. You use Auto Scaling Groups (ASGs) to automatically launch or terminate instances based on demand signals such as CPU usage or request rate. Combined with a load balancer, horizontal scaling ensures consistent performance during traffic spikes and cost efficiency during quiet periods.

Using both EC2 and Auto Scaling will ensure that your applications adapt to potential load spikes without manual intervention.
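To build intuition for horizontal scaling, here is a rough sketch of a proportional scaling rule. This is an illustrative assumption, not the algorithm AWS Auto Scaling uses internally: it simply sizes the fleet so that the average metric would land near the target, clamped to the group's minimum and maximum size.

```python
import math


def desired_capacity(
    current: int, metric: float, target: float, min_size: int, max_size: int
) -> int:
    """Proportional rule: scale the fleet by how far the metric is from target."""
    if current <= 0 or target <= 0:
        raise ValueError("current and target must be positive")
    proposed = math.ceil(current * metric / target)
    # Never go below the group's minimum or above its maximum.
    return max(min_size, min(max_size, proposed))


# 4 instances at 90% average CPU with a 60% target suggests growing to 6.
scale_out = desired_capacity(current=4, metric=90.0, target=60.0, min_size=2, max_size=10)
```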

OpenSearch

OpenSearch is an open-source search and analytics engine, originally derived from Elasticsearch. It is designed for full-text search, log analytics, and real-time data exploration at scale. OpenSearch is maintained by AWS and the community, offering a fully open alternative after Elasticsearch changed its license in 2021.

It allows you to index, search, and analyze large volumes of data quickly. It is commonly used for log aggregation (e.g., from web servers, applications, or containers), monitoring, security analytics, and business intelligence dashboards.

One of its key features is OpenSearch Dashboards (a fork of Kibana), which lets you visualize data in real time with graphs, charts, maps, and more. You can build custom dashboards for observability, performance tracking, or security insights.

OpenSearch integrates well with tools like Beats, Logstash, Fluentd, and Amazon CloudWatch, making it a powerful backend for monitoring infrastructure or application behavior. It supports REST APIs, structured and unstructured data, and complex queries across distributed systems.

In short, OpenSearch is used when you need fast, flexible, and scalable search and analytics on large or streaming datasets, especially in observability, operational intelligence, or search-driven apps.

PostgreSQL RDS

PostgreSQL RDS refers to Amazon Relational Database Service (RDS) running PostgreSQL, a fully managed relational database in the AWS cloud. It combines the power and features of PostgreSQL, an open-source, enterprise-grade database system, with the convenience of automated management through AWS.

With PostgreSQL RDS, you get all the features of standard PostgreSQL such as ACID compliance, complex queries, JSON support, indexing, and extensions (such as PostGIS for geospatial data) without having to manage the underlying infrastructure.

AWS handles key operational tasks such as:

provisioning and scaling the database
automatic backups and point-in-time recovery
patching and minor version updates
monitoring with Amazon CloudWatch
high availability with Multi-AZ deployments
replication using read replicas for improved read performance

You can also configure RDS to scale storage and compute resources based on your workload, and integrate it with other AWS services like Lambda, EC2, and IAM for secure, scalable architecture.

In short, PostgreSQL RDS is used when you want the capabilities of PostgreSQL but do not want to worry about maintenance, uptime, or scaling manually. It is ideal for web apps, analytics platforms, APIs, or any system that needs a robust relational database.

Lambda

AWS Lambda is a serverless compute service that lets you run code without provisioning or managing servers. You simply upload your code, and Lambda takes care of everything else by scaling, running, and managing infrastructure automatically.

With Lambda, your code runs in response to events. These events can come from many sources: HTTP requests via API Gateway, file uploads to S3, database updates in DynamoDB, scheduled timers (cron jobs), or even changes in a Kinesis stream.

Each Lambda function is triggered when needed and runs in an isolated environment. You are only charged for the compute time your code actually uses, which is measured in milliseconds. This makes it highly cost-effective for workloads that do not need to run constantly.

Developers often use Lambda for:

serverless APIs,
data processing (e.g., image resizing, log filtering),
automation (e.g., triggering actions on S3 events),
backend tasks (e.g., sending emails, notifications).

You can write Lambda functions in several languages, including Python, Node.js, Java, and Go. Since it integrates tightly with other AWS services, Lambda plays a central role in building event-driven, scalable, and resilient cloud applications without worrying about server management.

EventBridge

Amazon EventBridge is a serverless event bus that allows different AWS services, custom applications, and SaaS products to communicate through events. It is designed to help you build event-driven architectures, where components react to changes or actions rather than being tightly coupled.

EventBridge lets you capture events from a source (like an S3 file upload) and route them to a target, such as a Lambda function, SNS topic, or even another AWS service.

Each event is a small JSON message that describes what happened. EventBridge uses rules to filter and route these events to the correct destinations in near real-time.
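To make the idea of rules concrete, the sketch below implements a heavily simplified matcher in the spirit of EventBridge event patterns: each pattern key maps to a list of accepted values, and nested objects are matched recursively. Real patterns support many more operators (prefix, numeric, exists, and so on), and the bucket name and object key are made up for illustration.

```python
from typing import Any, Dict


def matches(pattern: Dict[str, Any], event: Dict[str, Any]) -> bool:
    """Return True if every field in the pattern is satisfied by the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: descend into the matching nested object.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            # Leaf pattern: a list of accepted values.
            return False
    return True


# Example event, shaped like an S3 "Object Created" notification.
event = {
    "source": "aws.s3",
    "detail-type": "Object Created",
    "detail": {"bucket": {"name": "highlight-uploads"}, "object": {"key": "clip.mp4"}},
}
pattern = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {"bucket": {"name": ["highlight-uploads"]}},
}
is_match = matches(pattern, event)
```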

You can use it to:

trigger automation in response to AWS service changes
connect microservices without hard wiring APIs
integrate with third-party SaaS tools like Zendesk or Datadog
replace polling with real-time event-driven flows

It is considered a foundational tool for building systems that are loosely coupled, scalable, and easier to evolve over time.

General flow

Now that we understand what each component does, let us walk through the general flow of the system.

First, we need to retrieve and store highlight data on a regular schedule. This task is handled by Amazon EventBridge, which triggers an AWS Lambda function at defined intervals. To ensure timely coverage of sports highlights, we can define the schedule as follows:

Today: Retrieve highlights every 1–5 minutes to make new content available to users almost immediately.
Yesterday: Retrieve highlights every 1 hour to catch official uploads that may arrive with a delay.
Two days ago: Retrieve highlights every 4 hours as a safety net in case any late updates or corrections are made to existing content.
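One way to express this schedule is as three scheduled rules, each passing a small payload to the same Lambda function. The sketch below is an assumption about how that could be wired up: the rule names, the `days_ago` payload field, and the handler's return value are all hypothetical, while the `rate(...)` strings follow the EventBridge schedule expression syntax.

```python
from datetime import date, timedelta

# Hypothetical rule definitions: one rule per time window from the list above.
SCHEDULES = [
    {"name": "highlights-today", "expression": "rate(5 minutes)", "days_ago": 0},
    {"name": "highlights-yesterday", "expression": "rate(1 hour)", "days_ago": 1},
    {"name": "highlights-two-days-ago", "expression": "rate(4 hours)", "days_ago": 2},
]


def target_date(days_ago: int, today: date) -> str:
    """Translate the rule's offset into the ISO date whose highlights we fetch."""
    return (today - timedelta(days=days_ago)).isoformat()


def handler(event: dict, context: object = None) -> dict:
    """Lambda-style entry point; the rule's payload tells us which day to fetch."""
    day = target_date(int(event.get("days_ago", 0)), date.today())
    # A real handler would now call the highlights API for `day` and store the results.
    return {"fetch_date": day}
```

Keeping the date arithmetic in its own function makes it easy to test without waiting for a real schedule to fire.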

Once the event is triggered, the Lambda function queries the Highlightly API to fetch the latest highlight data. After retrieving the data, it proceeds to store the results in both PostgreSQL RDS and OpenSearch.

In PostgreSQL RDS, the highlights are upserted, meaning that a new record is inserted if none exists, and if a record already exists, it is updated with the new data. This ensures any corrections or edits from the API provider are reflected in our database, keeping the data up to date.
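A minimal sketch of an upsert follows. To keep it runnable without a database server, it uses Python's built-in sqlite3 module as a stand-in; the `INSERT ... ON CONFLICT ... DO UPDATE` statement has the same shape in PostgreSQL, although a PostgreSQL driver would use a different placeholder style. The table and column names are assumptions for illustration.

```python
import sqlite3

# Insert a new row, or overwrite title and url when the id already exists.
UPSERT_SQL = """
INSERT INTO highlights (id, title, url)
VALUES (:id, :title, :url)
ON CONFLICT (id) DO UPDATE SET
    title = excluded.title,
    url = excluded.url
"""


def upsert_highlight(conn: sqlite3.Connection, highlight: dict) -> None:
    conn.execute(UPSERT_SQL, highlight)
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE highlights (id INTEGER PRIMARY KEY, title TEXT, url TEXT)")

# First call inserts the row; second call updates it with the corrected title.
upsert_highlight(conn, {"id": 1, "title": "Goal", "url": "https://example.com/1"})
upsert_highlight(conn, {"id": 1, "title": "Goal (corrected)", "url": "https://example.com/1"})

rows = conn.execute("SELECT id, title FROM highlights").fetchall()
```

After both calls the table still holds a single row, now carrying the corrected title.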

Next, the data is indexed and inserted into OpenSearch. This step enhances search capabilities, allowing users to find relevant highlights even if their search terms do not exactly match the title of the video.
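Tolerant matching of this kind is typically achieved with a fuzzy query. The sketch below builds a request body using the `multi_match` query with `fuzziness` set to `AUTO`, both of which are part of the OpenSearch query DSL. The field names `title` and `description` and the boost on the title are assumptions about how the index might be laid out.

```python
from typing import Any, Dict


def build_search_body(text: str, size: int = 10) -> Dict[str, Any]:
    """Build a fuzzy full-text query that returns only document IDs."""
    return {
        "size": size,
        "_source": False,  # we only need IDs; full rows come from the database
        "query": {
            "multi_match": {
                "query": text,
                "fields": ["title^2", "description"],  # title matches count double
                "fuzziness": "AUTO",  # tolerate small typos such as "barcelna"
            }
        },
    }


body = build_search_body("barcelna goals")
```

The resulting dictionary would be sent as the JSON body of a search request against the highlights index.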

When a user visits the website or application and performs a search, the request is first routed through Route 53, which directs it to the Elastic Load Balancer (ELB). Based on the ELB configuration, the request is forwarded to an appropriate EC2 instance running the API server.

The API server handles the search query by first querying OpenSearch to identify the most relevant highlights. Then, using the retrieved highlight IDs, it queries PostgreSQL RDS to assemble the full response data. Finally, the results are returned to the user.
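The two-step lookup can be sketched as follows. The search engine and the database are represented by injected callables, so the names `search_ids` and `load_rows` are hypothetical. One detail worth showing: a SQL `WHERE id IN (...)` query does not return rows in relevance order, so the ranking from the search step has to be reapplied.

```python
from typing import Callable, Dict, List

Highlight = Dict[str, object]


def search_highlights(
    query: str,
    search_ids: Callable[[str], List[int]],
    load_rows: Callable[[List[int]], List[Highlight]],
) -> List[Highlight]:
    """Rank with the search engine, then hydrate full records from the database."""
    ranked_ids = search_ids(query)
    if not ranked_ids:
        return []
    rows_by_id = {row["id"]: row for row in load_rows(ranked_ids)}
    # Reapply the relevance order and skip IDs missing from the database.
    return [rows_by_id[i] for i in ranked_ids if i in rows_by_id]


# Stand-ins for OpenSearch and PostgreSQL, for illustration only.
fake_db = {1: {"id": 1, "title": "Late winner"}, 2: {"id": 2, "title": "Penalty save"}}
results = search_highlights(
    "penalty",
    search_ids=lambda q: [2, 3, 1],
    load_rows=lambda ids: [fake_db[i] for i in sorted(ids) if i in fake_db],
)
```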

Enhancing the UX

To improve the experience of our website or application, we can do a bit more.

Geo-restriction is a feature that allows content creators or rights holders to limit the availability of their videos to specific countries or regions. This means viewers in certain locations may not be able to watch a video due to geographic restrictions set by the uploader.

Highlightly identifies two common types of geo-restriction:

Allowlist (whitelist): The video is only viewable in selected countries, and all other locations are blocked.
Blocklist (blacklist): The video is viewable everywhere except in the countries specifically blocked.

Geo-restrictions are often used for:

licensing agreements (e.g., content licensed only for certain regions),
regulatory compliance (e.g., blocking content in countries with strict media laws),
marketing strategies (e.g., releasing content regionally or at different times).

If a user tries to view a geo-restricted video, they will usually see a message like:

This video is not available in your country.

To enhance the UX, we can query the geo-restriction routes within the Highlightly API. If the current user is in a restricted country, we do not need to send them the highlight at all, since they will not be able to view it anyway.
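The filtering decision itself is small. The sketch below assumes a restriction is represented as a dictionary with a `type` of either `allowlist` or `blocklist` and a list of ISO country codes; this shape is hypothetical, so check the official documentation for the actual response format.

```python
from typing import Dict, List, Optional, Union

Restriction = Dict[str, Union[str, List[str]]]


def is_viewable(restriction: Optional[Restriction], country: str) -> bool:
    """Decide whether a viewer in `country` may be shown the highlight."""
    if restriction is None:
        return True  # no restriction data means the video is available everywhere
    countries = {code.upper() for code in restriction["countries"]}
    listed = country.upper() in countries
    if restriction["type"] == "allowlist":
        return listed  # only listed countries may watch
    if restriction["type"] == "blocklist":
        return not listed  # everyone except listed countries may watch
    raise ValueError(f"unknown restriction type: {restriction['type']}")


allow = {"type": "allowlist", "countries": ["GB", "IE"]}
block = {"type": "blocklist", "countries": ["US"]}
```

Running this check on the server before building the response means restricted highlights never reach the client in the first place.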

Main highlight page

For more in-depth details, please refer to the official documentation.

Highlights come in different formats: some are embeddable, while others can only be viewed on the official website hosting the content.

At first glance, if a highlight includes an embedUrl, it might seem safe to assume it can be embedded. However, that is not always the case. Platforms like YouTube allow content owners to disable embedding. In such cases, the URL remains valid, but attempts to embed the video will fail.

To handle this properly, we can use Highlightly's geo-restriction route to determine whether the video is truly embeddable or not. This helps prevent broken players or confusing errors for users.

In the worst-case scenario when a video cannot be embedded, we can display a preview image instead. When the user clicks the play button, they are redirected to the third-party site where the video is hosted. If the video passes the embeddability check, we simply embed it without needing any additional logic.
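The fallback logic can be sketched as a small function that tells the front end how to render the player. The `embedUrl` field comes from the text above; the `imgUrl` and `url` fields, as well as the shape of the returned dictionary, are assumptions made for illustration.

```python
from typing import Dict


def player_mode(highlight: Dict[str, str], embeddable: bool) -> Dict[str, str]:
    """Choose between an embedded player and a preview that links out."""
    embed_url = highlight.get("embedUrl")
    if embeddable and embed_url:
        return {"mode": "embed", "src": embed_url}
    # Not embeddable: show the preview image and send clicks to the host site.
    return {"mode": "redirect", "preview": highlight["imgUrl"], "href": highlight["url"]}


clip = {
    "embedUrl": "https://example.com/embed/42",
    "imgUrl": "https://example.com/thumb/42.jpg",
    "url": "https://example.com/watch/42",
}
embedded = player_mode(clip, embeddable=True)
fallback = player_mode(clip, embeddable=False)
```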

This approach keeps the user experience smooth while ensuring we respect the content limitations imposed by the original host.

General page layout

Let's examine a potential page layout to show our highlights.

The above image consists of two main sections:

Highlight Player: This section displays the selected match clip. The highlight is either embedded directly into the website or, if embedding is not possible, redirects the user to the original hosting site when played. Below the video, you will find the title, upload date, and a brief description. A “Share now” link is also available on the right for quick sharing.
Relevant Highlights: Since we know which highlight the user is currently viewing, we can populate this section with related content. Ideally, these are other highlights from the same league, or, if that is not possible, from the same country. If you are tracking user behavior and have a recommendation system in place, you can also present personalized suggestions based on viewing history.
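A simple version of the related-content rule, without any recommendation system, might look like the sketch below. It lists highlights from the same league first and then tops up with other highlights from the same country. The `id`, `league`, and `country` field names are assumptions.

```python
from typing import Dict, List

Highlight = Dict[str, object]


def related_highlights(
    current: Highlight, candidates: List[Highlight], limit: int = 5
) -> List[Highlight]:
    """Prefer the same league, then fall back to the same country."""
    others = [h for h in candidates if h["id"] != current["id"]]
    same_league = [h for h in others if h["league"] == current["league"]]
    same_country = [
        h
        for h in others
        if h["league"] != current["league"] and h["country"] == current["country"]
    ]
    return (same_league + same_country)[:limit]


catalog = [
    {"id": 1, "league": "La Liga", "country": "Spain"},
    {"id": 2, "league": "La Liga", "country": "Spain"},
    {"id": 3, "league": "Copa del Rey", "country": "Spain"},
    {"id": 4, "league": "Serie A", "country": "Italy"},
]
related = related_highlights(catalog[0], catalog)
```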

To enhance the user experience further, consider extending the interface with:

a comment section below the player,
a search bar above the player for easy navigation,
a side menu for quick searches or additional user actions.

Search results page

Another important page would be the relevant highlight search results page.

When a user enters a search query, the request is sent to your API server through the infrastructure described in the earlier sections. To avoid overloading the API with too many requests, we implement search debouncing. This means we wait 0.75 seconds after the last keystroke before sending a request. If a previous request has not been sent yet, we cancel it. Once the response is received, we use the data to autocomplete the user’s query.
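Debouncing is usually implemented in the front end, but the idea is language independent. The sketch below uses Python's asyncio to show it: every keystroke cancels the pending timer and starts a new one, and only the last input is sent once the delay elapses. The `Debouncer` class and the `send` callback are hypothetical, and the demo uses a 0.05 second delay instead of 0.75 seconds so that it runs quickly.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional


class Debouncer:
    """Runs `action` only after `delay` seconds without a new submission."""

    def __init__(self, delay: float, action: Callable[[str], Awaitable[None]]) -> None:
        self.delay = delay
        self.action = action
        self._pending: Optional[asyncio.Task] = None

    def submit(self, text: str) -> None:
        # A newer keystroke replaces any request that is still waiting.
        if self._pending is not None and not self._pending.done():
            self._pending.cancel()
        self._pending = asyncio.create_task(self._fire(text))

    async def _fire(self, text: str) -> None:
        await asyncio.sleep(self.delay)
        # Past the wait, the request counts as sent and is no longer cancellable.
        self._pending = None
        await self.action(text)


async def demo() -> List[str]:
    sent: List[str] = []

    async def send(text: str) -> None:
        sent.append(text)  # stand-in for the real API call

    debouncer = Debouncer(0.05, send)
    for partial in ["m", "me", "messi"]:
        debouncer.submit(partial)
        await asyncio.sleep(0.01)  # the user types faster than the delay
    await asyncio.sleep(0.2)  # give the final timer time to fire
    return sent
```

Running `asyncio.run(demo())` sends a single request for the final input rather than one per keystroke.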

When the user selects a search result, they are redirected to the highlight search results page.

The page shown above is split into two main sections:

Left Side – Search Results: This area displays relevant results returned by the API. These can directly match the user’s query or include nearby suggestions, such as recent highlights from the same league or country. If a recommendation system is in place, you can enhance this further by showcasing trending videos related to the query. Each item includes a preview image, the highlight title, upload date, league name, and a short description.
Right Side – Related News: This section features relevant news articles. Since Highlightly provides per-match news, we can display updates tied to the current highlight, offering users more context or background information. Your API server will need to make an external query and store the data in your PostgreSQL RDS database.

As with the other pages, we can enhance the interface with:

a search bar above the results for quick filtering,
a side menu for shortcuts or additional user actions.

Copyright claims

As with most content on the internet, sports highlights can be subject to copyright claims.

Copyright claims concern the legal ownership and control of video content that captures moments from sporting events such as goals, fouls, or game-winning plays. These clips, though short and often shared widely online, are still considered copyrighted material, typically owned by sports leagues, broadcasters, or event organizers who hold exclusive rights to distribute and monetize the footage.

Before showing content on your website, ensure that you have updated your terms and agreements page with the following sections:

Copyright Disclaimer, e.g.: Our website may contain embedded video or media content from third-party platforms such as YouTube, Vimeo, or others. All intellectual property rights in such content remain with the original copyright holders. We do not claim ownership or authorship of this material.
DMCA Compliance, e.g.: We respect the intellectual property rights of others and comply with the provisions of the Digital Millennium Copyright Act (DMCA). If you believe that content appearing on our site infringes your copyright, please submit a written notification to our designated Copyright Agent with the following details:
- Your contact information
- A description of the copyrighted work claimed to have been infringed
- The exact URL or location of the allegedly infringing material
- A statement that you have a good faith belief that use is not authorized
- A statement that the information in your notice is accurate and that you are the copyright owner or authorized to act on their behalf
- Your physical or electronic signature
Limitation of Liability, e.g.: We are not liable for any copyright infringement caused by embedded content from third-party platforms. All claims or disputes related to such content must be directed to the respective content providers or platform.

When a copyright or DMCA claim is filed, the best course of action is to comply by removing the allegedly infringing highlight. It may seem overwhelming at first, but the process is usually straightforward. In most cases, the claim is triggered by an automated system or AI detecting copyrighted content, meaning you may occasionally encounter false positives.

Wrap up

In this article, we aimed to present a conceptual overview of a high-scale architecture for building a sports highlight platform. Depending on your experience, technical skills, and project needs, you may choose to modify, replace, or omit some of the components we discussed.

That said, this general setup reflects what many of our customers use in their own applications and websites.

To wrap up, it is worth noting that the Highlightly API is not limited to YouTube-style sports platforms. It can also be integrated into betting platforms, data analysis tools, or simple match showcase pages.

For more details and implementation guidance, visit: documentation.