You have been tasked with building a URL file validator for a web crawler. A web crawler is an application that fetches a web page, extracts the URLs present in that page, and then recursively fetches new pages using the extracted URLs. The end goal of a web crawler is to collect text data, images, or other resources present in order to validate resource URLs or hyperlinks on a page. URL validators can be useful to validate if the extracted URL is a valid resource to fetch. In this scenario, you will build a URL validator that checks for supported protocols and file types.
What do you need to do?
1. Writing detailed comments and docstrings
2. Organizing and structuring code for readability
3. URL = <protocol>://<hostname>/<fileinfo>
Steps for Completion
Task
Create two lists of strings: one list for protocols called valid_protocols, and one list for file extensions called valid_fileinfo. For this task, the protocol list should be restricted to http, https, and ftp. The file extension list should be .html, .csv, and .docx.
Split an input named url, and then use the first element to see whether the protocol of the URL is in valid_protocols. Similarly, check whether the URL contains a valid file_info.
Task
Write the conditions to return a Boolean value of True if the URL is valid, and False if either the Protocol or the File extension is not valid.
main.py

def validate_url(url):
    """Validates the given url passed as string.

    Arguments:
    url --- String, A valid url should be of form <Protocol>://<Hostname>/<Fileinfo>

    Protocol = [http, https, ftp]
    Hostname = string
    Fileinfo = [.html, .csv, .docx]
    """
    # your code starts here.



    return  # return True if url is valid else False


if __name__ == '__main__':
    url = input("Enter an Url: ")
    print(validate_url(url))

Answers

Answer 1

Answer:

Python Code:

def validate_url(url):
    """Return True if url has a supported protocol and file extension."""
    # Creating the lists of valid protocols and file name extensions
    valid_protocols = ['http', 'https', 'ftp']
    valid_fileinfo = ['.html', '.csv', '.docx']

    # Splitting the url into two parts; a url without exactly one '://' is invalid
    url_split = url.split('://')
    if len(url_split) != 2:
        return False

    # The protocol must match exactly, and the rest must end with a valid extension
    isProtocolValid = url_split[0] in valid_protocols
    isFileValid = url_split[1].endswith(tuple(valid_fileinfo))

    # Returning True only if the URL has both a valid protocol and file extension
    return isProtocolValid and isFileValid


url = input("Enter an URL: ")
print(validate_url(url))

Explanation:

The image of the output code is attached. Hope it helps.
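As an alternative to splitting the string by hand, the same check can be sketched with the standard library's urllib.parse module. This is only a sketch under assumptions: the name validate_url_parsed is made up for illustration, and requiring a non-empty hostname is a choice the task does not spell out.

```python
import posixpath
from urllib.parse import urlparse

VALID_PROTOCOLS = {"http", "https", "ftp"}
VALID_FILEINFO = {".html", ".csv", ".docx"}


def validate_url_parsed(url):
    """Hypothetical helper: True if url has a supported protocol and file type."""
    parts = urlparse(url)
    # splitext returns (root, ext); ext is '' when the path has no extension
    extension = posixpath.splitext(parts.path)[1].lower()
    return (
        parts.scheme in VALID_PROTOCOLS  # urlparse lower-cases the scheme
        and bool(parts.netloc)           # assumption: a hostname is required
        and extension in VALID_FILEINFO
    )


for sample in ("https://example.com/index.html", "smtp://example.com/a.csv"):
    print(sample, validate_url_parsed(sample))
```

Parsing first means the extension is read from the path only, so a string such as ".html" appearing in the hostname does not count as a valid file type.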

You Have Been Tasked With Building A URL File Validator For A Web Crawler. A Web Crawler Is An Application

Related Questions

A ___________ is a variable used to pass information to a method.

Answers

Answer:

A parameter is a variable used to pass information to a method.

Explanation:

A parameter is a variable used to pass information to a method.

What are the features of parameter?

In general, a parameter (from Greek roots meaning "beside, subsidiary" and "measure") is any characteristic that helps to describe or classify a particular system. In other words, a parameter is an element of a system that is important or useful for identifying the system or for assessing its functionality, status, or other characteristics.

In some fields, such as mathematics, computer programming, engineering, statistics, logic, linguistics, and electronic music production, the term "parameter" has more precise definitions.

In addition to its technical applications, it also has broader meanings, particularly in non-scientific situations. For example, the terms "test parameters" and "game play parameters" refer to defining qualities or boundaries.

A novel method of characterizing surface texture, in particular surfaces having deterministic patterns and features, is the use of feature parameters.

Traditional methods for characterizing surface texture, such as profile and areal field parameters, are considered supplementary to the feature parameter approach.

Learn more about parameter, here

https://brainly.com/question/29911057


Write a program that will accept three values for the sides of a triangle from the user and determine whether the values are for an isosceles, equilateral, or scalene triangle.

Answers

Answer:

try asking the question by sending a picture rather than typing
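For the triangle question itself, a minimal Python sketch follows. The function name classify_triangle and the extra "not a triangle" case are assumptions added for illustration; the question only names the three triangle types.

```python
def classify_triangle(a, b, c):
    """Hypothetical helper: name the triangle type for three side lengths."""
    sides = sorted((a, b, c))
    # Triangle inequality: the two shorter sides together must exceed the longest
    if sides[0] <= 0 or sides[0] + sides[1] <= sides[2]:
        return "not a triangle"
    if a == b == c:
        return "equilateral"
    if a == b or b == c or a == c:
        return "isosceles"
    return "scalene"


# Fixed sample values stand in for the three numbers read from the user
for sample in ((3, 3, 3), (3, 3, 5), (3, 4, 5)):
    print(sample, classify_triangle(*sample))
```

In a full program, the three values would come from float(input(...)) calls before being passed to the function.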

Explain the paging concept and main disadvantages of pipelined
approaches? Compare the superscalar and super pipelined approaches
with block diagram?

Answers

Answer:

PAGING: In memory management with swapping, when ... Because the modules can be separated, it becomes easier to modify them. ... Advantages and disadvantages of paged segmentation

Explanation:

What are some things you need to be careful not to change while editing a macro? Check all that apply. commas names spaces brackets rows columns

Answers

Answer:

commas, spaces, brackets

Explanation:

I hope this helps.

What is the name for the part of a camera which can block light when it's closed, and let light in when it's open?


Pixel


Lens


Focus


Shutter

Answers

Don't listen to the link stuff.

Write an UPDATE statement that changes the address for the row with vendor_id 4 so the suite number (Ste 260) is stored in vendor address instead of vendor address 1. Then, use SQL Developer to verify the change (you may need to click the Refresh button at the top of the Data tab to see the change). If this works correctly, go back to the tab for the UPDATE statement and click the Commit button to commit the change.

Answers

Answer:

UPDATE Vendors SET address = 'Ste 260' WHERE vendor_id = 4;

Explanation:

Required

Write an update statement

The question is incomplete as the table name is not given.

So, I will make the following assumptions.

Table name = Vendors

So, the update statement is:

UPDATE Vendors SET address = 'Ste 260' WHERE vendor_id = 4;

The above statement queries the vendors table and changes the address  of vendor_id 4 from the initial value to Ste 260
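To try the statement without an Oracle database, a minimal sketch using Python's built-in sqlite3 module is shown here. It keeps the answer's assumed names (table Vendors, column address); the starting row is made up for illustration. Identifiers are left unquoted, since single quotes in SQL mark string literals rather than table or column names.

```python
import sqlite3

# Assumption: table and column names follow the answer above; the row is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Vendors (vendor_id INTEGER PRIMARY KEY, address TEXT)")
conn.execute("INSERT INTO Vendors VALUES (4, 'placeholder address')")

# Only the string value is in single quotes; identifiers are bare.
conn.execute("UPDATE Vendors SET address = 'Ste 260' WHERE vendor_id = 4")
conn.commit()  # plays the role of the Commit button in SQL Developer

# Verify the change, as the Data tab refresh would
row = conn.execute("SELECT address FROM Vendors WHERE vendor_id = 4").fetchone()
print(row[0])
```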

Do you think renewable energy can power the world? If so, why?

Answers

Answer:

yes

Explanation:

Because it is a new safer and more energy efficient way of producing energy

What is the weather in Ireland?

Answers

Answer:

Year-round, Irish weather as a whole tends to stick with what it knows: Mildly crisp weather, around 270 days of rain, with sunshine and wind. Although slight fluctuations occur dependent on the month and season, the average yearly temperature is around 50 degrees Fahrenheit.

Answer:

hi

Explanation:

The climate of Ireland is mild, humid and changeable with abundant rainfall and a lack of temperature extremes. ... January and February are the coldest months of the year, and mean daily air temperatures fall between 4 and 7 °C (39.2 and 44.6 °F) during these months.

have a nice day

I love Ireland very much

Which of the following uses heat from deep inside the earth to generate steam to make electricity?

Answers

Answer:

Geothermal power plants.

If you notice that a worksheet displays columns A, B, C, E, and F, what happened to column D?

Answers

Answer:

HUDSU

Explanation:WGSDBHEUIWDBWJJ

Joseph learned in his physics class that a centimeter is a small unit of length and that a hundred centimeters form a larger unit of length called a meter. Joseph recollected that in computer science, a bit is the smallest unit of data storage and a group of eight bits forms a larger unit. Which term refers to a group of eight binary digits? A. bit B. byte C. kilobyte D. megabyte

Answers

Answer:

byte

Explanation:

A byte is made up of eight binary digits

jettison folk 2007, Magnum opus, be moving, offers poisoned commentary on the film industry.

Answers

Answer:

I'm a little confused...

Explanation:

Can you reword this please?

Which of the following describes the line spacing feature? Select all that apply. adds space between words adds space between lines of text adds space between paragraphs adds space at the top and bottom of a page adds bullet points or numerical lists

Answers

Answer:

adds space between lines of text

adds space between paragraphs

Explanation:

Design a program that asks the User to enter a series of 5 numbers. The program should store the numbers in a list then display the following data: 1. The lowest number in the list 2. The highest number in the list 3. The total of the numbers in the list 4. The average of the numbers in the list

Answers

Answer:

The program in Python is as follows:

numbers = []

total = 0

for i in range(5):

   num = float(input(": "))

   numbers.append(num)

   total+=num

   

print("Lowest: ",min(numbers))

print("Highest: ",max(numbers))

print("Total: ",total)

print("Average: ",total/5)

Explanation:

The program uses list to answer the question

This initializes an empty list

numbers = []

This initializes total to 0

total = 0

The following loop is repeated 5 times

for i in range(5):

This gets each input

   num = float(input(": "))

This appends each input to the list

   numbers.append(num)

This adds up each input

   total+=num

   

This prints the lowest using min() function

print("Lowest: ",min(numbers))

This prints the highest using max() function

print("Highest: ",max(numbers))

This prints the total

print("Total: ",total)

This calculates and prints the average

print("Average: ",total/5)
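The same four results can also be sketched as a reusable function built on sum() and len(). The name summarize and the sample list are assumptions for illustration, not part of the original answer.

```python
def summarize(numbers):
    """Hypothetical helper: return (lowest, highest, total, average) of a list."""
    total = sum(numbers)
    # Dividing by len(numbers) keeps the average right if the list size changes
    return min(numbers), max(numbers), total, total / len(numbers)


lowest, highest, total, average = summarize([2, 4, 6, 8, 10])
print("Lowest: ", lowest)
print("Highest: ", highest)
print("Total: ", total)
print("Average: ", average)
```

Dividing by len(numbers) instead of the literal 5 means the function still works if the program is later changed to read a different number of values.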

From the philosophical standpoint, especially in the discussion of moral philosophy or ethics, why do we consider “murder” or “killing” wrong or bad?

Answers

Explanation:

Morality is a set of values and habits that a society acquires over time, which can be categorized as good and bad values, right and wrong, justice and crime. Ethics is defined as the study of morals, the practical application of moral behaviors defined by society.

Therefore, "murder" or "killing" is seen as an immoral act by the vast majority of societies around the world. This view is strengthened by a set of moral conduct common to all human beings, expressed in the articles of the Universal Declaration of Human Rights, an official UN document that contains several universal rules on the rights of every individual, such as the right to life, security, freedom, etc.

Consider the following recursive method, which is intended to display the binary equivalent of a decimal number. For example, toBinary(100) should display 1100100.

public static void toBinary(int num)
{
    if (num < 2)
    {
        System.out.print(num);
    }
    else
    {
        /* missing code */
    }
}

Which of the following can replace /* missing code */ so that toBinary works as intended?

a. System.out.print(num % 2);
toBinary(num / 2);

b. System.out.print(num / 2);
toBinary(num % 2);

c. toBinary(num % 2);
System.out.print(num / 2);

d. toBinary(num / 2);
System.out.print(num % 2);

e. toBinary(num / 2);
System.out.print(num / 2);

Answers

Answer:

D) toBinary(num / 2);

System.out.print(num % 2);

Explanation:
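Option D recurses on num / 2 first, so the higher-order bits are printed before the remainder num % 2, which is the lowest-order bit. A minimal Python sketch of the same recursion follows; as an assumption for easier checking, it returns the digits as a string instead of printing them one at a time.

```python
def to_binary(num):
    """Return the binary digits of a non-negative integer as a string."""
    if num < 2:
        return str(num)
    # Recurse on the quotient first so higher-order bits come first,
    # then append the remainder (the lowest-order bit).
    return to_binary(num // 2) + str(num % 2)


print(to_binary(100))  # prints 1100100
```

Swapping the two steps, as in option A, would emit the lowest-order bit first and produce the digits in reverse.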

Explain the term software dependability. Give at least two real-world examples that further elaborate on this term. Do you think that we can ignore software in our lives and still excel in the modern era? What is the role of software in the current pandemic period?

Answers

Answer:

Explanation: In software engineering, dependability is the ability to provide services that can defensibly be trusted within a time period. This may also encompass mechanisms designed to increase and maintain the dependability of a system or software. Computer software is typically classified into two major types of programs: system software and application software.

How does it relate
to public domain
and fair use?

Answers

Because the corilateion of the hippo

What is the name for the last word on a dictionary page?
a. Final word
b. Second guideword
c. First guideword
d. None of these
Please select the best answer from the choices provided


Answers

Answer:

B.Second guideword

Explanation:

User-defined blocks of code can be created in Snap using the ________ feature.
A. make a block
B. duplicate
C. create
D. define a function

Answers

D....................

Basic Python coding: what is the output of this program? Assume the user enters 2, 5, and 10.
numA = 0
for count in range(3):
    answer = input("Enter a number: ")
    fltAnswer = float(answer)
    numA = numA + fltAnswer
print(numA)
Thanks in advance!

Answers

Answer:

17.0

Explanation:

I ran it for you. You could also try that (go to replit).

Which of the following is an example of a firewall?
a. Notepad
b. Bit Defender Internet Security
c. Open Office
d. Adobe Reader

Answers

The answer is B, Bit Defender Internet Security.

Which of the following describes a codec? Choose all that apply.
a computer program that saves a digital audio file as a specific audio file format
short for coder-decoder
converts audio files, but does not compress them

Answers

Answer:

A, B

Explanation:

Consider the following recursive method.

public static String doSomething(String str)
{
    if (str.length() < 1)
    {
        return "";
    }
    else
    {
        return str.substring(0, 1) + doSomething(str.substring(1));
    }
}
Which of the following best describes the result of the call doSomething(myString) ?

A
The method call returns a String containing the contents of myString unchanged.

B
The method call returns a String containing the contents of myString with the order of the characters reversed from their order in myString.

C
The method call returns a String containing all but the first character of myString.

D
The method call returns a String containing only the first and second characters of myString.

E
The method call returns a String containing only the first and last characters of myString.

Answers

Answer:

A

The method call returns a String containing the contents of myString unchanged.

Explanation:
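Each call takes the first character and appends the result of the recursive call on the rest of the string, so the characters are reassembled in their original order. A minimal Python sketch of the same recursion (the name do_something simply mirrors the Java method):

```python
def do_something(s):
    """Rebuild s one character at a time, front to back."""
    if len(s) < 1:
        return ""
    # First character, followed by the recursively rebuilt remainder
    return s[:1] + do_something(s[1:])


print(do_something("recursion"))  # prints recursion
```

Reversing the string would instead require putting the recursive call before the first character.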

Which of the following best describes the safety of blogging?

Answers

we need the options

Generally safe, but there may be some privacy and security concerns. Therefore option B is correct.

While blogging can offer a relatively safe platform for expressing ideas and connecting with an audience, it is not entirely risk-free.

Privacy and security concerns can arise, especially when bloggers share personal information or discuss sensitive topics.

Cybersecurity threats, such as hacking or data breaches, can also compromise a blogger's personal information or their readers' data.

Additionally, bloggers should be mindful of online harassment and potential legal issues related to content ownership or copyright infringement.

Being aware of these risks and implementing best practices for online safety can help ensure a more secure and enjoyable blogging experience.

Therefore option B is correct.

Know more about Cybersecurity:

https://brainly.com/question/31928819

Your question is incomplete, but most probably your full question was.

Which of the following best describes the safety of blogging?

A. Completely safe, with no risks involved.

B. Generally safe, but there may be some privacy and security concerns.

C. Moderately safe, but potential risks exist, especially with sensitive topics.

D. Highly unsafe, with significant risks to personal information and security.

Complete the problem about Olivia, the social worker, in this problem set. Then determine the telecommunications tool that would best meet Olivia's needs.


PDA

VoIP

facsimile

Internet

Answers

Answer:

PDA is the correct answer to the question.

Explanation:

PDA stands for Personal Digital Assistant, which is a portable organizer that stores contact data, manages calendars, communicates via e-mail, and manages documents and spreadsheets, typically in conjunction with the user's personal computer. Olivia needs a PDA in order to communicate more effectively.

The code below assigns the 5th letter of each word in food to the new list fifth. However, the code currently produces errors. Insert a try/except clause that will allow the code to run and produce a list of the 5th letter in each word. If the word is not long enough, it should not print anything out. Note: The pass statement is a null operation; nothing will happen when it executes.
food = ["chocolate", "chicken", "corn
fifth = []
for x in food:
    fifth.append(x[4])

Answers

Answer:

Answered below

Explanation:

foods = ["chocolate", "chicken", "corn"]
fifth_char = []

for food in foods:
    # Try to add the fifth character to the fifth_char list. If the word is
    # not long enough, indexing raises IndexError, the except branch runs
    # pass, and nothing is added.
    try:
        fifth_char.append(food[4])
    except IndexError:
        pass

# Print out the elements in the new list
for c in fifth_char:
    print(c)

What is a font?
O How the text for a paragraph appears
O A display of text characters in a specific style and size
O Text that has been made bold
O Artistic elements you can add to text

Answers

Answer: A display of text characters in a specific style and size

How was the first computer reprogrammed?

Answers

Answer:

The first programs were meticulously written in raw machine code, and everything was built up from there. The idea is called bootstrapping. ... Eventually, someone wrote the first simple assembler in machine code.

Explanation:

What is the difference between an information system and a computer application?

Answers

Answer:

An information system is a set of interrelated computer components that collects, processes, stores and provides output of the information for business purposes

A computer application is a computer software program that executes on a computer device to carry out a specific function or set of related functions.
