The Notion web clipper only goes so far. Articles clipped with the extension can't populate particular database properties, and sometimes the body of the page doesn't populate either. I want to show you how I pieced together a little news stream project with notion-py, and break down each element of my code so you can understand the basics of notion-py.
Why a personal news stream? As a news junkie, I wanted a place to clip interesting headlines and articles into a calendar view, but I found manually adding properties like "category," "publication date," "source url," etc. tiresome. I'm still completing this project, but I thought I'd give a web scraping tutorial in the meantime.
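Change the view from table to gallery, then in the view's properties settings set Card Preview to Page Cover.
Install the following: Python, notion-py, md2notion, and newspaper3k (on PyPI the packages are named notion, md2notion, and newspaper3k).
Links you need to get started: how to find your Notion token_v2 (a cookie value from your logged-in Notion browser session), and the link to the table you want to manipulate in Notion.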
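If you'd rather not paste the token straight into the script, one option is to read both values from environment variables. This is only a sketch: the helper load_notion_settings and the variable names NOTION_TOKEN_V2 and NOTION_TABLE_URL are my own hypothetical choices, not anything notion-py requires.

```python
import os


def load_notion_settings(env=os.environ):
    """Return (token_v2, table_url) read from environment variables.

    NOTION_TOKEN_V2 and NOTION_TABLE_URL are hypothetical names chosen
    for this sketch; use whatever names suit your setup.
    """
    token = env.get("NOTION_TOKEN_V2", "").strip()
    table_url = env.get("NOTION_TABLE_URL", "").strip()

    # Collect the names of any settings that are empty or absent.
    missing = [
        name
        for name, value in (
            ("NOTION_TOKEN_V2", token),
            ("NOTION_TABLE_URL", table_url),
        )
        if not value
    ]
    if missing:
        raise RuntimeError(
            "Missing environment variable(s): " + ", ".join(missing)
        )
    return token, table_url
```

The two returned values can then be passed to NotionClient(token_v2=...) and client.get_collection_view(...) in place of the hard-coded placeholders.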
from notion.client import NotionClient
from md2notion.upload import upload
import newspaper
from newspaper import Article
import os
import sys
client = NotionClient(token_v2="INSERT TOKEN V2 HERE")
cv = client.get_collection_view("INSERT TABLE LINK HERE")
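# Join a list of strings (e.g. authors or keywords) into one string.
# For example, converttostr(["a", "b"], ", ") returns "a, b".
def converttostr(input_seq, seperator):
    final_str = seperator.join(input_seq)
    return final_str

seperator = ", "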
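url = "INSERT ARTICLE URL HERE"
toi_article = Article(url, language="en")
# Fetch the article's HTML.
toi_article.download()
# Extract the title, authors, publish date, body text, and top image.
toi_article.parse()
# Compute keywords and a summary (requires NLTK's punkt tokenizer data).
toi_article.nlp()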
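Once parse() and nlp() have run, the article object carries fields such as title, authors, publish_date, top_image, keywords, and summary. Here's a rough sketch of gathering those into a plain dict before writing anything to Notion. The helper article_to_properties and the dict keys are hypothetical names for illustration, and the stand-in object below only mimics a parsed Article so the snippet runs without a network request. It also assumes publish_date is either a datetime or None.

```python
from datetime import datetime
from types import SimpleNamespace


def article_to_properties(article, seperator=", "):
    """Collect fields from a parsed article into a plain dict.

    The keys are hypothetical; rename them to match your own
    Notion table's property names.
    """
    published = article.publish_date
    return {
        "title": article.title or "Untitled",
        "authors": seperator.join(article.authors),
        "keywords": seperator.join(article.keywords),
        # Assumes publish_date is a datetime or None.
        "publication_date": published.date().isoformat() if published else "",
        "source_url": article.url,
        "cover": article.top_image,
        "summary": article.summary,
    }


# Stand-in with the same attribute names as a parsed Article,
# used here only so the sketch runs offline.
fake_article = SimpleNamespace(
    title="Example headline",
    authors=["A. Writer", "B. Reporter"],
    keywords=["notion", "python"],
    publish_date=datetime(2021, 3, 4, 9, 30),
    url="https://example.com/story",
    top_image="https://example.com/cover.jpg",
    summary="A short summary.",
)

properties = article_to_properties(fake_article)
print(properties["authors"])
```

Keeping the scraped values in a dict first makes it easy to print and check them before they go anywhere near your table.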