Available on Apify

/scrape

Scrapes and extracts structured data from any web page. Below is a Python code example.

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "endpoint": "scrape",
    "url": "https://www.amazon.com/Cordless-Variable-Position-Masterworks-MW316/dp/B07CR1GPBQ/",
    "fields": {
        "name": "",
        "rating": "",
        "price": "",
        "brand": "",
        "key_selling_points": [],
    },
}

# Run the Actor and wait for it to finish
run = client.actor("zeeb0t/web-scraping-api---scrape-any-website").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

Please provide your API key and the URL you’d like to scrape. Then, include a JSON structure describing the data you want to extract.

You don’t need to match the JSON keys to the names or specific data on the webpage—just define how you want your data returned, and our AI will figure out the rest.

Response:

{
  "scrape":  < The populated JSON object that matches the structure you provided. >,
  "markdown": "< Markdown of the page which can be optionally saved for further analysis. >",
  "html": "< HTML of the page which can be optionally saved for further analysis. >"
}
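Each dataset item returned by the /scrape run follows the response shape above. As a minimal sketch of how you might separate the structured data from the raw page captures, here is a small helper; the function name `split_scrape_item` and the mock item are my own illustration, not part of the Actor's API:

```python
def split_scrape_item(item):
    """Separate the structured data ("scrape") from the raw page captures."""
    data = item.get("scrape", {})
    captures = {key: item.get(key, "") for key in ("markdown", "html")}
    return data, captures

# A mock item shaped like the documented response:
example = {
    "scrape": {"name": "Cordless Drill", "price": "$39.99"},
    "markdown": "# Cordless Drill",
    "html": "<html>...</html>",
}
data, captures = split_scrape_item(example)
```

You could then persist `captures["markdown"]` or `captures["html"]` to disk for later analysis while processing `data` immediately.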

/links

Scrapes and extracts links matching a description from any web page. Below is a Python code example.

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "endpoint": "links",
    "url": "https://www.ikea.com/au/en/cat/quilt-cover-sets-10680/?page=3",
    "description": "individual product urls",
}

# Run the Actor and wait for it to finish
run = client.actor("zeeb0t/web-scraping-api---scrape-any-website").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

Please provide your API key and the URL you’d like to scrape. Then, include a description of the type of links you want to extract.

Response:

{
  "links": [< An array of URLs that match the description you provided. >],
  "markdown": "< Markdown of the page which can be optionally saved for further analysis. >",
  "html": "< HTML of the page which can be optionally saved for further analysis. >"
}
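A common pattern is to feed the URLs returned by /links into one /scrape run per product page. As a sketch (the helper `build_scrape_inputs` and the example URLs are assumptions for illustration, not part of the Actor), each generated dict has the same shape as the /scrape `run_input` shown earlier:

```python
def build_scrape_inputs(links, fields):
    """Build one /scrape run_input per URL returned by /links."""
    return [
        {"endpoint": "scrape", "url": url, "fields": dict(fields)}
        for url in links
    ]

inputs = build_scrape_inputs(
    ["https://example.com/product/1", "https://example.com/product/2"],
    {"name": "", "price": ""},
)
```

Each entry in `inputs` could then be passed to `client.actor(...).call(run_input=...)` exactly as in the /scrape example.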

/next

Scrapes and extracts the 'next page' links from any web page with pagination. Below is a Python code example.

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "endpoint": "next",
    "url": "https://www.ikea.com/au/en/cat/quilt-cover-sets-10680/",
}

# Run the Actor and wait for it to finish
run = client.actor("zeeb0t/web-scraping-api---scrape-any-website").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

Please provide your API key and the URL you’d like to scrape.

Response:

{
  "next": [< An array of all matched 'next page' URLs. >],
  "markdown": "< Markdown of the page which can be optionally saved for further analysis. >",
  "html": "< HTML of the page which can be optionally saved for further analysis. >"
}
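To crawl an entire paginated listing, you can repeatedly call /next and follow the first returned URL until none remain. Here is a minimal traversal sketch; `fetch_next` is an injected callable standing in for the Actor call (so the loop stays testable without network access), and a page cap guards against unbounded crawls:

```python
def crawl_pages(start_url, fetch_next, max_pages=5):
    """Visit pages by following 'next' URLs until none remain or max_pages is hit.

    fetch_next(url) stands in for a /next Actor run and should return the
    list found under the response's "next" key for that url.
    """
    visited = []
    url = start_url
    while url and len(visited) < max_pages:
        visited.append(url)
        candidates = fetch_next(url)
        url = candidates[0] if candidates else None
    return visited

# Stubbed traversal over three catalogue pages:
pages = {"page1": ["page2"], "page2": ["page3"], "page3": []}
order = crawl_pages("page1", lambda u: pages[u])
```

In a real crawl, `fetch_next` would run the Actor with `{"endpoint": "next", "url": url}` and read the `next` key from the resulting dataset item.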

/search

Scrapes and extracts relevant URLs from Google search results pages. Below is a Python code example.

from apify_client import ApifyClient

# Initialize the ApifyClient with your Apify API token
# Replace '<YOUR_API_TOKEN>' with your token.
client = ApifyClient("<YOUR_API_TOKEN>")

# Prepare the Actor input
run_input = {
    "endpoint": "search",
    "google_domain": "www.google.com.au",
    "query": "AVID POWER 20V MAX Lithium Ion Cordless Drill Set",
    "page": 1,
}

# Run the Actor and wait for it to finish
run = client.actor("zeeb0t/web-scraping-api---scrape-any-website").call(run_input=run_input)

# Fetch and print Actor results from the run's dataset (if there are any)
print("💾 Check your data here: https://console.apify.com/storage/datasets/" + run["defaultDatasetId"])
for item in client.dataset(run["defaultDatasetId"]).iterate_items():
    print(item)

Please provide your API key and the Google search domain you’d like to scrape. Then, include the search query and page number of the Google search results.

Response:

{
  "search": [< An array of relevant URLs for any of the promoted and organic search results. >],
  "markdown": "< Markdown of the page which can be optionally saved for further analysis. >",
  "html": "< HTML of the page which can be optionally saved for further analysis. >"
}
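Since /search takes a page number, collecting results across several pages just means issuing one run per page. A small sketch of building those inputs (the helper name and defaults are my own, not part of the Actor; each dict matches the /search `run_input` shown above):

```python
def build_search_inputs(query, pages, google_domain="www.google.com"):
    """Build a /search run_input for each results page from 1 to `pages`."""
    return [
        {
            "endpoint": "search",
            "google_domain": google_domain,
            "query": query,
            "page": page,
        }
        for page in range(1, pages + 1)
    ]

inputs = build_search_inputs("cordless drill set", 3)
```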

Error Handling

How to handle the error response that any endpoint can return.

{
  "error": true,
  "reason": "Missing required parameters. Please check and try again with required parameters."
}

If any required parameters are missing or another error occurs, every endpoint returns a JSON object like the one above, with "error" set to true and a human-readable "reason".
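One way to surface these errors in your own code is to check each dataset item for the documented error shape before using it. The helper below is a sketch; the function name `check_item` is my own, and the non-error item shape follows the /scrape response:

```python
def check_item(item):
    """Raise if a dataset item carries the documented error shape."""
    if item.get("error"):
        raise RuntimeError(item.get("reason", "Unknown error"))
    return item

# A normal item passes through unchanged; an error item raises:
ok = check_item({"scrape": {"name": "Drill"}})
try:
    check_item({"error": True, "reason": "Missing required parameters."})
except RuntimeError as exc:
    message = str(exc)
```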

Can we help?

We're on Discord—let's chat!