Opinion
December 23, 2024

Lightricks-Shutterstock Partnership Sets New Standards for AI Training Data Licensing

Lightricks, a leader in the artificial intelligence content creation software space, has announced a new deal that gives it access to Shutterstock’s vast library of video assets under a novel “research license,” one intended to enhance its AI training efforts without raising copyright infringement issues.

The new license is designed to help AI startups get around one of their biggest challenges: the lack of accessible data to train their models. Rather than resorting to the legally and ethically questionable tactic of “scraping” data from the public internet, as others have been known to do, Lightricks is authorized to train its powerful LTXV video model on a smaller, highly curated set of videos.

This will help Lightricks nurture the LTXV model’s ability to achieve more consistent motion and structure over long video segments while maintaining quality and precision, which has proven to be an immense challenge in the industry.  

Lightricks’ Rapid Rise to the Forefront of AI Video

The announcement underscores the dynamism that has propelled Lightricks into the ranks of innovative tech companies trying to disrupt a generative AI industry that’s dominated by multinationals. 

Lightricks, which began life as the developer of a popular consumer photo-editing app called Facetune, aims to transform itself into an AI powerhouse, with a particular focus on aiding marketers and filmmakers. 

The company took major steps in this direction when it acquired the influencer-focused marketing platform Popular Pays, which the Lightricks product teams have subsequently augmented with AI tools for building creative briefs and vetting creators with brand safety concerns in mind. 

With the early 2024 debut of LTX Studio, a powerful video storytelling web app for creators that’s now powered by the open-source LTXV video generation model, Lightricks went to market several months before OpenAI was able to make its own video generation app, Sora, available to the public. 

Despite beating OpenAI, Google and Meta to the punch, Lightricks has many other rivals to contend with in AI video generation, where it competes against startups like Pika Labs, Luma and Runway. 

Uproar Over Unethical AI Training Methods 

With so many companies battling it out in the arena of AI video generation, the deciding factor for many creative teams will likely be the capabilities of the underlying generative models. Making a model more powerful generally means training it further, and that requires vast amounts of data.

The thirst for more training data has led to intense controversy, with many AI companies opting to scrape the internet without any consideration of the copyright issues pertaining to the use of that data. While the likes of OpenAI and Anthropic argue that scraping data is necessary to advance AI for everyone, artists, creators, writers and media companies complain they’re being exploited and insist the practice is illegal.

It has resulted in a burning debate – should AI firms be allowed to consume any copyrighted works they want to accelerate the development of their models, or should they be required to obtain permission and compensate those who created them? 

Former OpenAI researcher Suchir Balaji came to the latter conclusion, after overseeing that company’s efforts to amass training data. He has said it involved scraping everything from pirated book archives to user-generated content on social media and even content hidden behind paywalls, without ever obtaining permission. 

OpenAI made the assumption that if data was posted on the internet and freely available, it was fair game. 

Balaji initially went along with this stance, but he told the New York Times that he later began to question it, before concluding that the company was violating copyright law. He believes that technologies like ChatGPT cause harm to creators, and he said that was the main reason behind his decision to quit the company earlier this year. 

Data Scraping Is Clearly Unsustainable

While the New York Times, Sony Music and others are now resorting to legal action against many leading AI firms, we’ve seen what appears to be a tacit recognition from the industry that such data collection practices are not going to be sustainable in the longer term.

AI companies, ethicists and content creators appear to be reaching an understanding, and there has been a flood of reports about new content licensing deals that give AI developers access to creators’ material so they can train their models in a way that doesn’t raise legal and ethical questions.

News publishers appear to be among the most enthusiastic, with the likes of the Associated Press, Axel Springer, the Financial Times, News Corp., Le Monde, Time and the Atlantic all announcing that they’ve struck deals with OpenAI and rival firms. But it’s not only news organizations that are doing this. For instance, OpenAI has also secured a deal with Reddit to access its vast internet forums in search of ever-more training data.

Shutterstock’s rival Getty Images has embraced this trend too, collaborating with the image generation startup Picsart to help it develop a new model that’s trained exclusively on Getty’s licensed content. They aim to appeal to marketers and small businesses with the concept of “commercially-safe AI-generated imagery.”

Shutterstock itself is not new to this, and has been working with OpenAI for some time. It first agreed to let OpenAI train its DALL-E image generation model on Shutterstock content in 2021, and last year extended that deal for another six years.

Meanwhile, another leader in the video generation space, Runway AI, said in September it’s working with the Hollywood studio Lionsgate to train its models on that company’s extensive film and TV show library of more than 20,000 titles. 

A More Accessible Licensing Model

What’s different about Lightricks’ agreement with Shutterstock is that the AI company won’t have to pay through the nose to access the video library, thanks to the more flexible nature of its research license. 

Lightricks has explained that the deal allows the company’s researchers to optimize and refine its LTXV model using high-quality, licensed data. This research-focused approach to accessing data will enable it to develop LTXV in a more cost-effective way, and it’s a model that could potentially appeal to other AI startups that might be struggling to afford licensed data. 

The research license still ensures that contributors are compensated, too, earning 20% of the revenue from any data licensing deals, and they also have the option to opt out completely. 

Daniel Mandell, Shutterstock’s global head of data licensing and AI, told VentureBeat that the company created its research license because it believes financial investment shouldn’t be a barrier to innovation for startups. “The important message here is that companies, no matter their size or funding, no longer have an excuse to scrape unlicensed content for training purposes,” he said.

About The Author

Gregory, a digital nomad hailing from Poland, is not only a financial analyst but also a valuable contributor to various online magazines. With a wealth of experience in the financial industry, his insights and expertise have earned him recognition in numerous publications. Utilising his spare time effectively, Gregory is currently dedicated to writing a book about cryptocurrency and blockchain.

Gregory Pudovsky