Example 1: infologistix GmbH

This example scrapes the infologistix GmbH website. It is intended for crawling the company's consulting services from the webpage.

Uses infologistix/docker-python-selenium:alpine as the image to run on.

class main.InfologistixCrawler(url: str, headless: bool = True)

Bases: object

Example crawler for the infologistix homepage. Crawls the services listed on the webpage and returns them.

Parameters
  • url (str) – the url to scrape

  • headless (bool, default: True) – set to True when running in a headless environment

Examples

>>> crawler = InfologistixCrawler(url="https://infologistix.de", headless=False)
>>> print(crawler.run())

close() → None

Closes the driver.
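
Because the crawler holds a driver, it is worth making sure close() runs even when scraping fails. The following is a minimal sketch, assuming only that the crawler exposes close() as documented above. StubCrawler is a hypothetical stand-in so the snippet runs without a browser, and contextlib.closing is one way to guarantee the call, not necessarily what this project does internally.

```python
from contextlib import closing


class StubCrawler:
    """Hypothetical stand-in for InfologistixCrawler (no browser needed)."""

    def __init__(self):
        self.closed = False

    def run(self):
        # The real run() returns a pandas DataFrame; a plain list is used here.
        return ["service A", "service B"]

    def close(self):
        # The real close() closes the driver; the stub only records the call.
        self.closed = True


crawler = StubCrawler()
# closing() calls crawler.close() on exit, even if run() raises.
with closing(crawler) as c:
    result = c.run()
```

The same pattern should apply to the real class, since closing() only requires an object with a close() method.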

getServices() → list

Scrapes the services from the infologistix GmbH webpage.

Returns

list – unsorted list of dict-like service structures
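
Since the returned list is unsorted, callers that need a stable order have to sort it themselves. The following is a minimal sketch under stated assumptions: the keys of the dict-like structures are not documented here, so "title" and "description" are hypothetical, and the sample entries are placeholders rather than real scraped data.

```python
from operator import itemgetter

# Placeholder data shaped like a possible getServices() result.
# The keys "title" and "description" are assumptions for illustration.
services = [
    {"title": "Service B", "description": "placeholder"},
    {"title": "Service C", "description": "placeholder"},
    {"title": "Service A", "description": "placeholder"},
]

# sorted() returns a new list and leaves the scraped list untouched.
ordered = sorted(services, key=itemgetter("title"))
titles = [service["title"] for service in ordered]
```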

makeFrame(services: list) → pandas.core.frame.DataFrame

Converts the list into a human-readable table format.

Parameters

services (list) – unsorted list of services

Returns

pd.DataFrame – the services in table form

run() → pandas.core.frame.DataFrame

Runs the crawler and performs its actions in the right order.

Returns

pd.DataFrame – the services in table form

main.sendMSTeams(webhook: str, message: str, title: str) → Literal[True]

Sends a message to a Teams channel. Requires a configured webhook for MS Teams.

Parameters
  • webhook (str) – webhook URI to connect to

  • message (str) – the message body; can be plain text, Markdown, or HTML

  • title (str) – the message's title

Returns

Literal[True] – message was sent
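
To illustrate what sending to a webhook involves, here is a minimal sketch using only the standard library. It assumes the webhook accepts the legacy MessageCard JSON format; build_card and post_card are hypothetical helper names, and the documented sendMSTeams may well be implemented differently (for example through a helper library).

```python
import json
import urllib.request


def build_card(message: str, title: str) -> bytes:
    """Encode a legacy MessageCard payload as UTF-8 JSON (assumed format)."""
    card = {
        "@type": "MessageCard",
        "@context": "http://schema.org/extensions",
        "title": title,
        "text": message,
    }
    return json.dumps(card).encode("utf-8")


def post_card(webhook: str, message: str, title: str) -> bool:
    """POST the card to the webhook; True if the server answers with 200."""
    request = urllib.request.Request(
        webhook,
        data=build_card(message, title),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # urlopen raises urllib.error.HTTPError for error status codes.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status == 200


# Build and inspect a payload without sending anything over the network.
payload = json.loads(build_card("Crawl finished.", "Crawler"))
```

Keeping payload construction separate from the network call makes the message format easy to check without a live webhook.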