    Web Scraping with Python

    Basics of HTML for Web Scraping

    Family of markup languages for displaying information viewable in a web browser.

    HTML, or HyperText Markup Language, is the standard markup language for creating web pages. It is a cornerstone technology of the World Wide Web, and understanding it is essential for web scraping, because the data a scraper extracts lives inside HTML elements. This article covers the basics: tags and attributes, document structure, and how CSS and JavaScript relate to HTML.

    Introduction to HTML

    HTML is used to describe the structure of web pages using markup. The elements of HTML are the building blocks of all websites. HTML allows images and objects to be embedded and can be used to create interactive forms. It also provides a means to create structured documents by denoting structural semantics for text such as headings, paragraphs, lists, links, quotes and other items.

    Understanding HTML Tags and Attributes

    HTML tags are keywords enclosed in angle brackets that tell the web browser how to structure and display content; the browser does not render the tags themselves. Most elements have two parts, an opening tag and a closing tag. For example, <html> is the opening tag and </html> is the closing tag. The closing tag has the same name as the opening tag, preceded by a slash (/). A few elements, such as <br> and <img>, have no content and therefore no closing tag.
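
    The pairing of opening and closing tags can be observed with Python's standard-library html.parser module. The following is a minimal sketch rather than a full scraper; the names TagLogger, log_tags and sample are made up for this illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record every opening and closing tag in the order it appears."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # Called once for each opening tag, e.g. <p>
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        # Called once for each closing tag, e.g. </p>
        self.events.append(("close", tag))


def log_tags(markup):
    """Parse a string of HTML and return its list of tag events."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


# Hypothetical sample markup for illustration
sample = "<html><body><p>Hello</p></body></html>"
tag_events = log_tags(sample)

for kind, name in tag_events:
    print(kind, name)
```

    Each element opens and closes in nested order: the <p> element closes before <body>, and <body> closes before <html>.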

    HTML attributes are name-value pairs written inside an element's opening tag. An attribute either modifies the default behaviour of an element or supplies information that the element needs to function at all. For example, the href attribute of the <a> (anchor) tag specifies the URL that the link points to.
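
    Reading attributes is one of the most common scraping tasks. As a rough sketch, the standard-library parser passes each opening tag's attributes to handle_starttag as a list of (name, value) pairs, so the href values of all links can be collected as follows. LinkCollector, collect_links and the sample URLs are assumptions for this example.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # attrs is a list of (name, value) tuples for this opening tag
        href = dict(attrs).get("href")
        if href is not None:
            self.links.append(href)


def collect_links(markup):
    """Return the href values found in a string of HTML."""
    parser = LinkCollector()
    parser.feed(markup)
    parser.close()
    return parser.links


# Hypothetical sample markup; the URLs are placeholders
sample = (
    '<p>See <a href="https://example.com/docs" title="Docs">the docs</a> '
    'and <a href="/about">about</a>.</p>'
)
links = collect_links(sample)
print(links)
```

    The title attribute on the first link is ignored here because only href is requested; any other attribute could be read from the same dictionary.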

    HTML Document Structure

    An HTML document has two main parts: the head and the body. The head element contains the title and metadata of the document. The body element contains the content displayed on the page. Each HTML document begins with the declaration <!DOCTYPE html>, which tells the browser the document type and HTML version.
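
    To make the head/body split concrete, here is a small hypothetical page together with a sketch that separates the doctype declaration, the title from the head, and the visible text from the body. PAGE, StructureParser and its attribute names are invented for this example, and the parser assumes a simple, well-formed document.

```python
from html.parser import HTMLParser

# Hypothetical minimal page for illustration
PAGE = """<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>Sample page</title>
  </head>
  <body>
    <h1>Welcome</h1>
    <p>First paragraph.</p>
  </body>
</html>"""


class StructureParser(HTMLParser):
    """Separate the doctype, the <title> text and the text inside <body>."""

    def __init__(self):
        super().__init__()
        self.doctype = None
        self.title_parts = []
        self.body_text = []
        self._section = None   # "head", "body" or None
        self._in_title = False

    def handle_decl(self, decl):
        # Receives the declaration without the surrounding <! and >
        self.doctype = decl

    def handle_starttag(self, tag, attrs):
        if tag in ("head", "body"):
            self._section = tag
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in ("head", "body"):
            self._section = None
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return  # skip whitespace between tags
        if self._in_title:
            self.title_parts.append(text)
        elif self._section == "body":
            self.body_text.append(text)


parser = StructureParser()
parser.feed(PAGE)
parser.close()

page_title = " ".join(parser.title_parts)
print(parser.doctype)
print(page_title)
print(parser.body_text)
```

    The title comes from the head and never appears in the body text, which mirrors how a browser shows the title in the tab rather than on the page.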

    Introduction to CSS and JavaScript

    While HTML provides the structure, Cascading Style Sheets (CSS) control presentation, formatting, and layout. JavaScript, on the other hand, is a programming language used to add dynamic, interactive content to web pages.
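
    CSS and JavaScript show up in the markup as <style> and <script> elements, and CSS rules are commonly attached to elements through the class attribute. As an illustration that goes slightly beyond the text above, the sketch below uses a class name to pick out specific elements while leaving the style and script content aside. ClassTextExtractor, the sample markup and the class names are assumptions, and the extractor deliberately ignores nested elements with the same tag name.

```python
from html.parser import HTMLParser

# Hypothetical sample markup mixing HTML, CSS and JavaScript
SNIPPET = """
<style>.price { color: green; }</style>
<ul>
  <li><span class="name">Tea</span> <span class="price">3.50</span></li>
  <li><span class="name">Coffee</span> <span class="price highlight">4.00</span></li>
</ul>
<script>console.log("loaded");</script>
"""


class ClassTextExtractor(HTMLParser):
    """Collect the text of elements whose class attribute contains a given name.

    Simplification: an element nested inside a match and sharing its tag
    name would end the capture early.
    """

    def __init__(self, wanted_class):
        super().__init__()
        self.wanted_class = wanted_class
        self.matches = []
        self._capturing_tag = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # The class attribute may hold several space-separated names
        classes = (dict(attrs).get("class") or "").split()
        if self._capturing_tag is None and self.wanted_class in classes:
            self._capturing_tag = tag
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == self._capturing_tag:
            self.matches.append("".join(self._buffer).strip())
            self._capturing_tag = None

    def handle_data(self, data):
        # Text inside <style> and <script> also arrives here,
        # but it is only kept while a matching element is open
        if self._capturing_tag is not None:
            self._buffer.append(data)


extractor = ClassTextExtractor("price")
extractor.feed(SNIPPET)
extractor.close()
prices = extractor.matches
print(prices)
```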

    Understanding the basics of HTML is crucial for web scraping because it shows you how the data is structured and how to access it. In the next unit, we will introduce Beautiful Soup, a Python library for pulling data out of HTML and XML files.
