---
title: "Python: How to load multiple web pages in parallel"
date: "2022-05-15"
---
First, install the aiohttp package:
```bash
pip install "aiohttp[speedups]"
```
<!--more-->
The `[speedups]` extra installs optional packages that accelerate aiohttp, such as aiodns and cchardet (the exact set depends on the aiohttp version). aiodns is also required by `aiohttp.AsyncResolver`, which the code below uses. Then create a `main.py` file with this code:
```python
import aiohttp
import asyncio
import socket


async def fetch_urls(urls):
    # Asynchronous DNS resolver (needs aiodns), IPv4 only, DNS cache disabled.
    resolver = aiohttp.AsyncResolver()
    connector = aiohttp.TCPConnector(resolver=resolver, family=socket.AF_INET, use_dns_cache=False)

    async def fetch_url(url, session):
        # Request one page and print its status code and body.
        async with session.get(url) as resp:
            print(resp.status)
            print(await resp.text())

    # "async with" closes the session even if one of the requests fails.
    async with aiohttp.ClientSession(connector=connector) as session:
        tasks = [fetch_url(url, session) for url in urls]
        # Start all requests and wait until every one of them has finished.
        await asyncio.gather(*tasks)


urls = ['http://httpbin.org/get?key=value1', 'http://httpbin.org/get?key=value2', 'http://httpbin.org/get?key=value3']

# asyncio.run() (Python 3.7+) creates the event loop, runs the coroutine and closes the loop.
asyncio.run(fetch_urls(urls))
```
Now run `main.py` with the command:
```bash
python3 main.py
```
You will see output similar to this (the responses may be printed in any order):
```plaintext
200
{
  "args": {
    "key": "value2"
  },
  "headers": {
    "Accept": "*/*",
    "Accept-Encoding": "gzip, deflate",
...
```
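In the output above, the response for `value2` is printed first even though it is the second URL in the list: each response is printed as soon as it arrives. If the order matters, one option is to return values from the coroutines instead of printing them, because `asyncio.gather` returns its results in the order of its arguments. The sketch below illustrates this without any network access; `fake_fetch` and its delays are hypothetical stand-ins for real requests, not part of aiohttp.

```python
import asyncio


async def fake_fetch(name, delay, finished):
    # Hypothetical stand-in for an HTTP request: it only waits for `delay` seconds.
    await asyncio.sleep(delay)
    finished.append(name)  # record the order in which the coroutines complete
    return name


async def main():
    finished = []
    # "value2" completes first, but gather() still returns results in argument order.
    results = await asyncio.gather(
        fake_fetch("value1", 0.3, finished),
        fake_fetch("value2", 0.1, finished),
        fake_fetch("value3", 0.2, finished),
    )
    return results, finished


results, finished = asyncio.run(main())
print("returned:", results)
print("finished:", finished)
```

The same idea should carry over to the aiohttp code: have `fetch_url` return `await resp.text()` and read the list returned by `asyncio.gather`.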
All three requests are executed concurrently: none of them waits for the previous one to finish. You can put any URLs in the `urls` list, for example:
```python
urls = ['https://yandex.com', 'https://google.com', 'https://yahoo.com']
```

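With arbitrary URLs, some requests may fail. By default, `asyncio.gather` propagates the first exception to the caller, so the results of the other requests are not returned. Passing `return_exceptions=True` collects exceptions in the result list instead. The following is a minimal sketch of that pattern; `fake_fetch`, the example URLs and the simulated `ConnectionError` are assumptions made for illustration (real aiohttp failures are typically subclasses of `aiohttp.ClientError`).

```python
import asyncio


async def fake_fetch(url):
    # Hypothetical stand-in for fetch_url: fails for one URL, succeeds for the others.
    await asyncio.sleep(0.01)
    if "bad" in url:
        raise ConnectionError(f"cannot connect to {url}")
    return f"body of {url}"


async def fetch_all(urls):
    # return_exceptions=True stores raised exceptions in the result list
    # instead of propagating the first one to the caller.
    return await asyncio.gather(*(fake_fetch(u) for u in urls), return_exceptions=True)


urls = ["https://example.com/a", "https://bad.example/b", "https://example.com/c"]
results = asyncio.run(fetch_all(urls))

for url, result in zip(urls, results):
    if isinstance(result, Exception):
        print(f"{url}: failed ({result})")
    else:
        print(f"{url}: ok")
```

This way one unreachable site does not hide the responses from the others.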
To make other kinds of requests (POST, PUT, DELETE, HEAD, OPTIONS, PATCH), replace `session.get(url)` in your code with the corresponding method:
```python
session.post('http://httpbin.org/post', data=b'data')
session.put('http://httpbin.org/put', data=b'data')
session.delete('http://httpbin.org/delete')
session.head('http://httpbin.org/get')
session.options('http://httpbin.org/get')
session.patch('http://httpbin.org/patch', data=b'data')
```