AI Positive - Rich Skrenta from Common Crawl // AI Inside 1
January 25, 2024
01:01:12

On the premiere episode of the AI Inside podcast, hosts Jeff Jarvis and Jason Howell talk with Rich Skrenta of the Common Crawl Foundation about AI and copyright. News outlets are limiting access to content they publish publicly, which affects the integrity of Common Crawl's internet archive. In recent years, the archive has been used as training data for LLMs, so restricting what it can collect has a dramatic impact on the quality of the data that survives.

INTERVIEW
- Introduction and background on AI Inside podcast
- Discussion of the recent Senate hearing on AI oversight at which Jeff testified
- Introduction of guest Rich Skrenta from Common Crawl Foundation
- Overview of Common Crawl and its goals to archive the open web
- Discussion of how Common Crawl data is used to train AI models
- News publishers wanting content removed from Common Crawl
- Debate around copyright, fair use, and AI's "right to read"
- Mechanics of how Common Crawl works and what it archives
- Concerns about restricting AI access to data for training
- Risk of regulatory capture and only big companies being able to use AI
- Discussion of recent court ruling related to web scraping
- Hopes for Common Crawl's growth and evolution

NEWS BITES
- Interesting device announcement from CES - Rabbit R1 with Perplexity AI integration
- Study on actual risk of AI automating jobs away in the near future



_____________________

AI INSIDE
SUBSCRIBE: http://www.youtube.com/@YellowgoldStudios?sub_confirmation=1
WEBSITE: http://aiinside.show
PATREON: http://www.patreon.com/aiinsideshow
TWITTER: http://www.twitter.com/AIInsideShow
INSTAGRAM: http://www.instagram.com/aiinsideshow
THREADS: https://www.threads.net/@aiinsideshow
MASTODON: https://mastodon.social/@aiinsideshow

_____________________

SUPPORT OUR WORK
AII PATREON: http://www.patreon.com/aiinsideshow
YELLOWGOLD STUDIOS PATREON: http://www.patreon.com/JasonHowell
BUY ME A COFFEE: https://www.buymeacoffee.com/yellowgoldstudios
AFFILIATE LINKS (thank you!): https://aiinside.show/affiliates

_____________________

OUR OTHER SHOWS
ANDROID FAITHFUL: http://www.androidfaithful.com
THIS WEEK IN GOOGLE: http://www.twit.tv/twig

_____________________

GET IN TOUCH WITH ME
EMAIL: https://aiinside.show/contact

BUSINESS AND SPONSORSHIP INQUIRIES: jason(at)yellowgoldstudios(dot)com

_____________________

AFFILIATES
These are the tools we use to produce this show. If you click on our affiliate links below, we receive a small commission. And MOST of the time, you will receive an offer too. So, you know, we both win! THANK YOU for supporting independent podcasting!

Podcastpage: This is the tool we use to create our website. It was easy to spin the site up in a matter of a few days.
https://podcastpage.io/?via=yellowgoldstudios

Acast: Our podcast host, though Acast offers several other services to make podcasting easier for independents like us. Sign up through this link and get 25% off of your first two months with Acast.
https://acast.com/?utm_campaign=rewardful&utm_medium=web&utm_source=referral%20&via=yellowgoldstudios

Streamyard: What we use for the live technical production of AI Inside. Guests connect to Streamyard easily, and Jason has access to control the audio and video live switching. It's like a Tricaster in the cloud. Use this link to get $10 in credit toward your Streamyard account.
https://streamyard.com/pal/d/6142587533131776

Perplexity AI: The LLM we use to help craft copy for things like show notes, promotional materials, and more. Use this link and you'll get $10 worth of credit.
https://perplexity.ai/pro?referral_code=L4CS5QUR

OpusClip: An AI platform that analyzes full episodes of the podcast to pull out small video clips for social media marketing. This stuff would take hours to do without a service like OpusClip. Check it out!
https://www.opus.pro/?via=da6896