Netflix and Movies - Computer Science Project Report

1. Meeting the brief




2. Investigation


https://covid19ireland-geohive.hub.arcgis.com/ is one service that aligns well with the project brief. It appears to use a dataset on Ireland’s COVID-19 cases and displays categories such as “Confirmed Headline Figures” and “Vaccination Data”. It also features a carousel that lets users move between dates using left and right arrows. This design is both space-efficient and interactive, so I am strongly considering it for my project. Further down the page, a tab system separates the different types of graphs. This could be a useful approach for my website, as it helps conserve both space and resources.
One dataset that particularly interested me and gave me many ideas is the Kaggle Netflix Movies and TV Shows dataset (https://www.kaggle.com/datasets/shivamb/netflix-shows). It includes several valuable fields, such as whether the title is a movie or a TV show, the country of production, the date it was added to Netflix, and its original release date. I believe this dataset will be an excellent choice for my project. The dataset is provided in CSV format, which I will need to parse. Fortunately, Python includes a built-in CSV parsing library, so the user will not need to install any additional library for this.
I also need to consider accessibility by incorporating features such as adjustable font sizes and a simplified design. This will help prevent the page from feeling cluttered with excessive data and overwhelming the user. Viewing the data should require low physical effort, so the user should not have to click through a row of buttons to reach specific data. A well-organized website with a clear heading and a simple layout will make information more accessible to users.
One concern I had regarding the brief was ensuring that the graphs are interactive and update dynamically when the underlying values change or when the user changes a selection. I found a library called Bokeh, which specializes in this functionality, so I will be exploring its potential for the project. If Bokeh is not suitable, Pygal, another Python graphing library, or Chart.js, a JavaScript library, would be strong alternatives to consider. Since Bokeh can generate standalone HTML files, I should also explore using Flask to host these files and manage POST and GET requests. This will be especially important for handling user interaction with the dataset.
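As a rough illustration of the CSV parsing step mentioned above, the sketch below counts how often each value appears in one column using only the standard library. The sample rows and the title column are made up for the example, and count_values is a hypothetical helper name, not part of the project code.

```python
import csv
import io
from collections import Counter

# Made-up sample rows; the real project would open the dataset file instead.
SAMPLE_CSV = """show_id,type,title,country,release_year
s1,Movie,Example Film,Ireland,2019
s2,TV Show,Example Series,United States,2020
s3,Movie,Another Film,Ireland,2021
"""


def count_values(csv_file, column):
    """Count how often each value appears in one column of a CSV file."""
    reader = csv.DictReader(csv_file)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"Unknown column: {column}")
    return Counter(row[column] for row in reader)


type_counts = count_values(io.StringIO(SAMPLE_CSV), "type")
print(type_counts)
```

With the real dataset, the in-memory string would be replaced by a file opened with open(path, newline="", encoding="utf-8"), and the resulting counts could then feed whichever graphing library is chosen.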


3. Plan and design


The brief requires us to develop an “interactive information system to display analytics on a chosen dataset.” I will structure my project into two main parts: a backend and a frontend. The backend will be the Python script that runs the website and holds all the data, and the frontend will be the website itself, which the user interacts with. I will use Flask, a lightweight WSGI (Web Server Gateway Interface) web application framework, to develop the project. Flask allows me to use Python to handle different user requests, such as retrieving specific categories or movie details, and to respond with the data that the user requested. This will make the site more interactive and responsive, providing a web-based interface for public access.
In the backend, I will use Python’s built-in CSV library to read and extract relevant data from the Netflix Movies and TV Shows dataset. Additionally, I will use the JSON library to save movie suggestions provided by users in a JSON file. These suggestions will be submitted through a form on the movie suggestion page, handled with JavaScript on the frontend, and sent to the backend via a POST request.
For the graphs themselves, I will use Pygal, a Python library, to create different kinds of visualizations, to include the necessary titles, axis labels, and legends, and to allow users to interact with the visualizations. The initial page will feature visualizations based on dataset categories such as "Type" and "Date Added." These visualizations will use different chart types, such as line graphs, pie charts, and treemaps. Each graph will also have accessibility options, including a full-screen button for a clearer view of the data. The goal is to present the information clearly and efficiently, so that users do not feel confused or overwhelmed.
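The suggestion-saving step in this plan could look roughly like the sketch below, which appends one suggestion to a JSON file using only the standard library. The function name save_suggestion, the file name suggestions.json, and the exact fields stored are assumptions for illustration; in the Flask app, a function like this would be called from the handler for the POST request.

```python
import json
import tempfile
from pathlib import Path


def save_suggestion(path, suggestion):
    """Append one suggestion dict to a JSON file that holds a list."""
    path = Path(path)
    # Start from an empty list the first time the file is used.
    if path.exists():
        suggestions = json.loads(path.read_text(encoding="utf-8"))
    else:
        suggestions = []
    suggestions.append(suggestion)
    path.write_text(json.dumps(suggestions, indent=2), encoding="utf-8")
    return suggestions


# Demonstration in a temporary directory so no real file is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    demo_file = Path(tmp) / "suggestions.json"
    save_suggestion(demo_file, {"show_id": 1, "rating": 4.5})
    saved = save_suggestion(demo_file, {"show_id": 8807, "rating": 2.0})

print(len(saved))
```

Reading the whole file and rewriting it on every suggestion is simple and adequate for a small project, although it would not be safe if several requests wrote to the file at the same moment.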
A user can then go to a recommendations page and see a list of recommended movies and TV shows based on their selected movie or TV show and its associated genres. The recommendations will be categorized as High, Medium, or Low, giving the user a range of options based on their selection. On the frontend, I will primarily use JavaScript to send requests to the backend and to display the graphs that the backend generates. I will use HTML and CSS to design the web pages, ensuring they are visually appealing, fast, and easy to navigate.
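One possible way to sort recommendations into High, Medium, and Low is to count how many genres a candidate shares with the selected title. The sketch below shows that idea only; the thresholds (3 or more shared genres for High, 2 for Medium, 1 for Low) and both function names are hypothetical and are not taken from the project itself.

```python
def split_genres(listed_in):
    """Turn a comma-separated genre string into a set of genre names."""
    return {genre.strip() for genre in listed_in.split(",") if genre.strip()}


def recommendation_tier(selected_genres, candidate_genres):
    """Return 'High', 'Medium', 'Low', or None based on shared genres.

    The thresholds are assumptions made for this example:
    3 or more shared genres is High, 2 is Medium, 1 is Low.
    """
    shared = len(selected_genres & candidate_genres)
    if shared >= 3:
        return "High"
    if shared == 2:
        return "Medium"
    if shared == 1:
        return "Low"
    return None


selected = split_genres("Dramas, International Movies, Thrillers")
high = recommendation_tier(selected, split_genres("Thrillers, Dramas, International Movies"))
low = recommendation_tier(selected, split_genres("Dramas, Comedies"))
none = recommendation_tier(selected, split_genres("Kids' TV"))
print(high, low, none)
```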

I will be using systems such as:


Flowchart of Project

Flowchart Diagram

4. Create


Week 01:
Week 02:
Week 03:
Week 04:
Week 05:
Week 06:
Week 07:
Week 08:
Week 09:
Week 10:
Week 11:
Week 12:

Pseudocode

      -> User enters website
      -> User selects a category from the dropdown, e.g. “Type”
        -> Send category to backend using fetch
          -> Backend checks if category exists in the dataset
            -> If No
              -> Return 404 error code
            -> If Yes
              -> Get occurrences of each element in the dataset
              -> Check if there are more than 200 different elements
                -> If Yes
                  -> Construct an SVG saying “No Data”
                  -> Encode it in base64
                  -> Return the base64
                -> If No
                  -> Create Style elements for pygal
                  -> Create different style variable instances
                  -> Assign titles and labels
                  -> Add data to each chart
                  -> Create list with render_data_uri for each graph
                  -> Return the URIs
        -> Frontend receives response from backend
          -> Clear any embeds or graphs from before
          -> Set margin-top to 3rem on the display-box
          -> JSON parse the data received and loop over each URI:
            -> Create a container div element
            -> Assign class name to div
            -> Create an embed element
            -> Assign the current URI to the embed src
            -> Specify that the embed type is image/svg+xml
            -> Create button element
            -> Assign class name to button
            -> Assign onclick attribute to button to allow fullscreen
            -> Assign innerText to button
            -> Append button and embed as children of container div
            -> Append container div as child of chart-container
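The backend half of the pseudocode above can be sketched in Python as follows. This is a minimal, standard-library-only illustration: prepare_chart_data and no_data_uri are hypothetical names, the shape of the returned payload is an assumption, and the Pygal chart-building step is only indicated by a comment rather than implemented.

```python
import base64
from collections import Counter

MAX_CATEGORIES = 200  # limit taken from the pseudocode above


def no_data_uri():
    """Build a small 'No Data' SVG and return it as a base64 data URI."""
    svg = (
        '<svg xmlns="http://www.w3.org/2000/svg" width="300" height="100">'
        '<text x="150" y="55" text-anchor="middle">No Data</text>'
        "</svg>"
    )
    encoded = base64.b64encode(svg.encode("utf-8")).decode("ascii")
    return "data:image/svg+xml;base64," + encoded


def prepare_chart_data(rows, category):
    """Return (status_code, payload) for one requested category.

    rows is a list of dicts, for example as produced by csv.DictReader.
    """
    if not rows or category not in rows[0]:
        return 404, {}
    counts = Counter(row[category] for row in rows)
    if len(counts) > MAX_CATEGORIES:
        return 200, {"uris": [no_data_uri()]}
    # The real backend would pass these counts to Pygal here and
    # return one render_data_uri() string per chart instead.
    return 200, {"counts": dict(counts)}


demo_rows = [{"type": "Movie"}, {"type": "TV Show"}, {"type": "Movie"}]
print(prepare_chart_data(demo_rows, "type"))
print(prepare_chart_data(demo_rows, "fwefwefew"))
```

Keeping the counting and the limit check in a plain function like this makes the logic easy to test without starting the web server.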
      

Testing

Both during development and after finishing the programming of the project, I tested multiple scenarios on the frontend and the backend. One problem I came across was that the Pygal graphs were not interactive. This happened when I tried to use Pygal's render_response feature, which is supposed to work with Flask but did not work for me. I found that it worked if I returned render_data_uri instead. This returns a base64 data URI that I could use with an embed tag, which fixed the problem, apparently because the URI itself carries the complete SVG along with the references to any libraries or scripts that the graph needs.

Action | Input(s) | Expected Output(s) | Actual Output(s) | Test Result | Test Data Used | Comments
Validate acceptable Show_id (Suggestions) | Any number between 1-8807 | Accepts a valid show_id and continues | Accepted a variety of show_id values between 1-8807 and continued | Pass | 1, 123, 156, awe, 8807, 0, 123213, 1e1 | It also accepted 1e1 and defaulted it to 1. For any invalid input it disabled the submit button.
Validate acceptable Rating (Suggestions) | Any float between 1.0-5.0 | Accepts a valid rating float and continues | Accepted a valid rating float between 1.0-5.0 and continued | Pass | 0.0, 0.1, 1.0, 5.0, 1.0000, 1e1, 1e | Any number like 1.0 or 5.0 is rounded to 1 or 5. For any invalid input it disabled the submit button.
Validate show id (Recommendations) | Any number between 1-8807 | Accepts a valid show id and displays the movie selected and recommendations | Accepted a variety of show ids between 1-8807 and showed the movie selected and recommendations | Pass | jjj, false, 1, 2, 8807, 1.0, 4.3, 1e1 | For any non-number input and for floats it showed an error in an alert box. For numbers like 1e1 it returned no recommendations and no movie selection, as the ID doesn't exist.
Main Page function | Choices from dropdown (Type, Country, Date_added, Release_year, Rating, Listed-In) | Accepts a valid choice and produces 3 graphs (bar, pie, treemap) and their respective full screen buttons | Accepted a valid choice and produced 3 graphs (bar, pie, treemap) and their respective full screen buttons | Pass | Country, Date_added, Release_year, Rating, Listed-In, fwefwefew, 123, 4 | For non-valid categories the backend returns a 404, so no graphs are created.
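The input rules in the test table could also be checked on the backend with something like the sketch below. It is an assumption-based illustration, not the validation code used in the project: parse_show_id and parse_rating are hypothetical names, inputs are assumed to arrive as strings, and the behaviour differs from the frontend in that an input such as 1e1 is rejected as a show id rather than defaulted to 1.

```python
MAX_SHOW_ID = 8807  # highest show id, taken from the test table above


def parse_show_id(raw):
    """Return the show id as an int, or None if the input is not valid."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return None
    return value if 1 <= value <= MAX_SHOW_ID else None


def parse_rating(raw):
    """Return the rating as a float from 1.0 to 5.0, or None if not valid."""
    try:
        value = float(raw)
    except (TypeError, ValueError):
        return None
    return value if 1.0 <= value <= 5.0 else None


# The same test data as in the table, passed in as strings.
show_id_inputs = ["1", "123", "156", "awe", "8807", "0", "123213", "1e1"]
rating_inputs = ["0.0", "0.1", "1.0", "5.0", "1.0000", "1e1", "1e"]
show_id_results = [parse_show_id(text) for text in show_id_inputs]
rating_results = [parse_rating(text) for text in rating_inputs]
print(show_id_results)
print(rating_results)
```

Returning None for every invalid case keeps the calling code simple: the request handler only has to check for None before continuing.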

5. Evaluation


I found the final artifact quite impressive! I included all aspects of both the Basic and Advanced requirements, although I had to adjust my initial plans to align with them. Initially, I made the mistake of using JavaScript to calculate the data and generate the graphs, even though the brief specified: “Use Python to carry out analytics on your data and to create at least two different basic visualizations.” As a result, I had to switch from a JavaScript library, Chart.js, to a Python library, Pygal. Despite this change, I was able to create a well-functioning and visually appealing project. I also incorporated features that I believe add significant value to a project like this, such as the search bar functionality. However, there are some improvements that could be made:


6. References



7. Summary word count


Section | Word Count
1. Meeting the brief | 0 (Video)
2. Investigation | 401
3. Plan and design | 441
4. Create | 1107
5. Evaluation | 304
Total | 2253