OpenAI is in danger
Produced by Huxiu Technology Group | Author: Qi Jian | Editor: Liao Ying
On August 7, another Chinese AI startup released its own open-source, free-for-commercial-use AI model: XVERSE-13B. The company, Yuanxiang XVERSE, was founded by Yao Xing, former vice president of Tencent and founder of Tencent AI Lab.
Since Meta released the LLaMA 2 series of large models as open source for free commercial use in July, a new wave of "open source" has been brewing in the AI large-model market.
On August 2, Wenxin Qianfan, Baidu's AI large-model platform, announced access to the full series of open-source LLaMA 2 models, bringing the number of large models that can be called on the platform to 33. Apart from the 3 Wenxin models, the other 30 are all open-source models, including ChatGLM2, RWKV, MPT, Dolly, OpenLLaMA, and Falcon.
The next day, Alibaba Cloud also announced it would join the open-source camp, open-sourcing the 7-billion-parameter Tongyi Qianwen models: the general-purpose model Qwen-7B and the dialogue model Qwen-7B-Chat. Both models have been launched on the ModelScope community and are open source, free, and commercially usable.
Interestingly, this warm attitude toward open source began with Microsoft, the biggest backer of the closed-source ChatGPT. **On July 18, Microsoft announced it would partner with Meta to release the open-source, commercially usable LLaMA 2 model, giving enterprises an alternative to the models of OpenAI and Google.** OpenAI's dominant position in the AI large-model market seems to be "targeted" by the entire industry, even by its closest partner.
As the globally recognized number one large language model, OpenAI's GPT-4 is currently the only large language model that a large number of users are willing to pay for.
The top student in the class usually has no motivation to join a study group. Likewise, **OpenAI has no reason or motivation to open source.**
However, with the full open-sourcing of LLaMA 2, more and more developers are investing in Meta and the various open-source models. Just as Android used open source to fight iOS, a crowd of open-source AI models is bypassing GPT-4's technical barriers and encircling OpenAI with an open-source ecosystem.
Why open source?
When OpenAI first launched its plug-in feature, many people compared AI large models to a future Windows, iOS, or Android. Now, with the release of LLaMA 2, not only the functions but even the structure of the AI large-model market is developing in the direction of operating systems.
LMSYS Org, an organization led by UC Berkeley, runs a ranking competition for large language models (LLMs). As of July 20, the latest edition of the ranking covered 40 large AI models. The top five were still closed-source (proprietary) models: GPT-4, GPT-3.5-turbo, and three Claude models. Of the models that followed, however, all but Google's PaLM-Chat-Bison-001 were open source, 15 of them non-commercial.
In terms of model capability, no model on the market, open source or closed source, dares to compete head-on with GPT-4. But even a tiger cannot fend off a pack of wolves: unable to beat GPT-4 directly, the rest have chosen to "overtake by changing lanes," using open source to seize the application ecosystem. This looks somewhat like Android's fight against iOS.
"Right now, all open source big models have one purpose, and that's marketing."
The founder of a domestic open-source large-model company admitted to Huxiu that the main motivation for open-sourcing large models, as with the open-source Android system, is to grab the market for free. "Many big companies have released large AI models, some of them little more than an application built on an existing model, and then promoted them with great fanfare. For a vendor of foundational large models, spending more money on advertising is actually less effective than genuinely open-sourcing the model." Open source is also the best way for AI companies to prove their strength.
First, open-source models are easier to evaluate than closed-source ones. Because their code and datasets are public, researchers can directly inspect a model's architecture, training data, and training process, analyze it in depth, and understand its strengths and weaknesses.
"Some AI large models appear very capable, but they are not open source, and all you can see is their output."
With a closed-source model, by contrast, strengths and weaknesses can only be inferred from performance evaluations, so a closed-source model's performance may be artificially exaggerated and its shortcomings hidden. The transparency of open-source models helps developers understand them more deeply and evaluate them more fairly.
For latecomers, closed-source models carry another problem: the originality of the technology is easily questioned. Many large-model developers have told Huxiu: "For a model that is not open source, to put it bluntly, who knows whether it is just a shell around LLaMA, or simply calls the ChatGPT API in the background?"
When the first wave of domestic AI large models came out, such doubts circulated widely online, and models that are not open source have a hard time proving their innocence. To prove they were not calling the ChatGPT API, some companies even wheeled out their inference servers and unplugged the network cable for a live demonstration.
Open source is undoubtedly one of the best ways for a large AI model to certify itself. But **the real value of open source is not self-certification; it is seizing the ecosystem.**
"After the release of LLaMA 2, it will quickly eat into OpenAI's ecosystem." A large-model developer told Huxiu that although GPT-4 is almost universally recognized as the most capable model, OpenAI's models after GPT-3 are not open source, and the GPT-4 API is not very open either, so development on GPT models faces many restrictions. As a result, many developers choose open-source models such as LLaMA, which allow not only instruction fine-tuning but also research on the underlying model.
"LLaMA is definitely more popular among developers than OpenAI."
When LLaMA 2 was released on July 19, GitHub had more than 5,600 projects with "LLaMA" in their keywords and more than 4,100 with "GPT-4". In the two weeks since its release, LLaMA has grown faster: as of press time there are more than 6,200 "LLaMA" projects and more than 4,400 "GPT-4" projects.
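Keyword counts like these can be read off GitHub's public repository-search API. A minimal sketch follows; the endpoint is GitHub's public REST API, while the helper names and the canned response are our own illustration:

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen  # used only in the live-call example below

GITHUB_SEARCH = "https://api.github.com/search/repositories"

def search_url(keyword: str) -> str:
    """Build a GitHub repository-search URL for a keyword."""
    return f"{GITHUB_SEARCH}?{urlencode({'q': keyword})}"

def repo_count(payload) -> int:
    """Extract the total repository count from a search response body."""
    return json.loads(payload)["total_count"]

# A live query would look like:
#   with urlopen(search_url("llama")) as resp:
#       print(repo_count(resp.read()))
# Here we parse a canned response instead of hitting the network:
sample = '{"total_count": 6200, "items": []}'
print(repo_count(sample))  # -> 6200
```

Comparing `repo_count` for the queries `"llama"` and `"gpt-4"` over time would reproduce the kind of trend the article cites.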
On the other hand, open-source models can be downloaded for private, on-premises deployment, which makes AI training easier for commercial companies. Such companies need to train AI applications on their own business data, and a privately deployed large model protects data security to the greatest extent. Private deployment also offers more choices of computing power, whether cloud services, local hardware, or even distributed computing across multiple IDCs, which greatly reduces the cost of model training and inference.
Although ChatGPT gathered 100 million monthly active users in just two months, in the developer ecosystem open-source models seem to be capturing mindshare even faster.
At present, many domestic AI companies have chosen to release open-source models, including ChatGLM-6B from Zhipu AI, MOSS from Fudan University, Wudao Tianying (Aquila) from the Zhiyuan Research Institute, and Baichuan-7B (13B) from Baichuan Intelligent. Among them, **ChatGLM-6B, the open-source large model released by Zhipu AI, has been downloaded more than 4 million times worldwide and has earned 32,000 stars on GitHub, 3,000 more than LLaMA.**
"If we don't make an open-source model, the market will soon be full of LLaMA." An executive at an AI company that has launched an open-source model told Huxiu that open source is an important step in the development of China's large AI models.
In fact, before the LLM craze began, generative AI had already fought one battle between open source and closed source.
**With a large developer base and many product applications, the open-source text-to-image model Stable Diffusion almost pushed OpenAI's closed-source model DALL-E 2 into a corner. Although users generally consider Stable Diffusion less capable than another closed-source product, MidJourney, its open-source, free nature won it a large share of the text-to-image market and made it the most mainstream text-to-image model. The companies behind it, RunwayML and Stability AI, have also attracted a great deal of attention and financing.**
The open-sourcing of LLaMA 2 looks intended to put the same pressure on OpenAI in the LLM field.
Open source that drives sales
LLaMA 2 is currently open source across all three models in the series: the 7-billion-, 13-billion-, and 70-billion-parameter versions. However, rumor in the market has it that "Meta actually has a larger-parameter version that has not been released; the next version may have larger parameters, but it may not be open source."
It is worth noting that many of today's open-source models are not fully open source. Of the Wudao 3.0 models released by the Zhiyuan Research Institute, only the "Tianying" basic language model is open source; Zhipu AI's ChatGLM has opened only part of the series, and the larger 130-billion-parameter model remains closed source.
Regardless of whether LLaMA 2 is "holding back" a larger model, the "free" approach will undoubtedly accelerate the formation of Meta's ecosystem in the large-model market and push it down Android's "old road."
Through its open-source ecosystem, Android accumulated a huge number of developers and users worldwide, greatly counterbalancing the leading closed-source system iOS in the technology ecosystem and even forming monopolies of its own in some markets. Since 2018, the European Union has fined Google more than 4 billion euros over the Android system's monopolistic mechanisms; that sky-high fine is itself a measure of how profitable the open-source Android system is.
According to research firm Sensor Tower, user spending on Google Play was about US$53 billion in 2022 and is expected to grow to US$60 billion in 2023. According to a report from another research institution, Statista, as of January 2022 there were about 140,000 applications in the Google Play Store.
At this stage, open-source AI models are obviously nowhere near as widespread as mobile phones. But even if AI really does become as ubiquitous as the phone, a giant like Meta will not easily let go of companies that make serious money with LLaMA 2.
The LLaMA 2 open-source license contains this stipulation: **if monthly active users exceed 700 million, you must apply to Meta for a license. Meta may grant it at its sole discretion, and until it does, you are not authorized to exercise any of these rights.**
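The clause amounts to a simple gate on user scale. A toy sketch, with the caveat that the constant and function name are ours and that how monthly active users are actually measured is defined by the license text itself:

```python
# Toy illustration of the LLaMA 2 license clause described above.
# The 700-million figure comes from the article; this is a simplification,
# not legal guidance.
MAU_THRESHOLD = 700_000_000

def needs_meta_license(monthly_active_users: int) -> bool:
    """True if the user must separately apply to Meta for a license."""
    return monthly_active_users > MAU_THRESHOLD

print(needs_meta_license(50_000_000))     # typical startup: False
print(needs_meta_license(2_000_000_000))  # hyperscaler scale: True
```

In practice the threshold excludes only a handful of the very largest consumer platforms, which is why the license still reads as "free" to almost everyone else.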
At the same time, besides promoting closed-source versions and AI large-model applications, open-source models can also help "sell" computing power.
The first two vendors in China to push AI large models, Baidu and Alibaba, are both cloud vendors. The other two cloud vendors, Tencent Cloud and Huawei Cloud, have no LLM products comparable to Wenxin Yiyan or Tongyi Qianwen, yet they too keep talking up AI models. The main reason is the large model's power to drive cloud sales.
"Announcing moves on AI large models is driven jointly by the market and by customers. In the past few months, far too many customers have come asking about large models." A Tencent Cloud business lead told Huxiu that customers queuing up for computing power is the best proof that AI large models can drive sales.
**The model does not need to make money, but the computing power must be profitable.** Alibaba open-sourced Tongyi Qianwen, and Baidu introduced 30 open-source models onto the Wenxin Qianfan large-model platform; both moves deliver "free" AI capabilities to users. Users of the open-source models no longer pay for AI itself, but as long as their AI runs on Alibaba Cloud or Baidu Smart Cloud, they must pay for computing power.
"AI should go back to the logic of the cloud and make money from the cloud." Xin Zhou, general manager of Baidu Smart Cloud's AI and Big Data Platform, said the original intent of opening up the large-model platform is to create value for customers' businesses; in doing so it increases the stickiness of existing customers and brings in new ones, which greatly helps cloud vendors expand their economies of scale.
Free is more expensive
"Ten million: that's about the starting price for customizing a large model."
The founder of an open-source large-model company quoted this price over the phone to an intermediary who had called to inquire.
"Once the open-source model has won users' recognition, you can talk about service fees for custom development." After hanging up, the founder explained to Huxiu that a model like LLaMA 2 costs at least tens of millions of dollars to develop, so the market he is targeting must be tens or hundreds of times the development cost.
For now, **the best way for AI companies to make money from open-source models is to start with services.**
Fortunately, most users of large AI models badly need those services.
"The model is open source, free, and commercially usable. That means everything, from downloading the model through deployment, training, tuning, and application development, has to be done by yourself." An LLaMA application developer told Huxiu that most suppliers of closed-source models provide training and deployment services and can custom-develop features to user needs; with an open-source model, you do all of that yourself. No one trains it for you, no one finds you computing power, and no one does custom development for you.
"What closed-source model vendors actually sell is services." The head of an online-education institution that has built AI applications on the LLaMA model told Huxiu, "The open-source model looks free, but a lot of money still has to be spent during deployment." After adopting large AI models, the labor and computing costs of his IT department rose significantly.
Training and tuning on an open-source model is not difficult for most IT staff, but deeper development of the model still requires technical reserves in algorithms and AI. And as the concept of large AI models grows hotter, the price of such talent keeps rising.
"The rise in labor costs is somewhat elastic, but the cost of servers and hardware is real. Since we began investing in large models, our costs have risen by about 20%-30%," said the aforementioned head of the online-education institution. His organization is still at the stage of exploring AI scenarios, and the biggest difficulty is that every scenario must be tried out. "If one fails, we swap in another, and every step of that process costs money."
On this point, Huxiu asked a Baidu Smart Cloud insider about the services and costs of Wenxin Qianfan for deployment; the gist of the reply was that **using the platform saves money.**
In fact, whether the model is open source or closed source, deployment cost is reckoned in person-days, and there is no essential difference in the computing-power cost of subsequent training and inference. "**Training, deploying, and developing on an open-source model by yourself simply makes the process very troublesome,**" the Baidu insider said. Specific deployment costs depend on the project and vary greatly, but in deployment and usage costs there is no essential difference between open source and closed source. And from the perspective of data security, most closed-source models can also be deployed privately.
**At this stage, AI is still far from being accessible to everyone.**
Most companies in Internet businesses have their own IT R&D teams and, when large models arrive, can quickly assemble a team to build AI applications. But for many retail, traditional-manufacturing, and service businesses, digital transformation is already a hard problem, and studying the training, deployment, and inference of large AI models is genuinely beyond them.
For these companies, the best AI product is a general-purpose AI plug-in. "What we need is just a customer-service bot whose replies don't sound so idiotic. Asking me to learn to train a model myself is a bit much." The business-line manager of an e-commerce brand told Huxiu that over the past six months he has only heard that AI dialogue is stronger than before; he has not yet even tried ChatGPT. He is willing to embrace new technology, but that alone is not enough motivation to spend time learning about AI and money investing in it now.
"Unless the platform or software I'm already using has a plug-in I can simply pick up and use, I won't rush to spend money upgrading to an AI assistant." For now, his willingness to pay is low.
"**Merchants need AI that can be applied and add value without their even noticing it.**" SaaS vendor Weimob has built exactly such an application, WAI, for digital marketing: it lets merchants call AI capabilities embedded in their existing applications, giving them large-language-model-based dialogue plus text and image generation.
Connecting large models to SaaS tools through open interfaces is somewhat similar to model invocation on Baidu's Wenxin Qianfan: although it involves only API calls and fine-tuning, it gives users more, faster, and more stable ways to put AI to work.
"Open-source models make it easier for users to get started, and many of them now update faster than the big vendors' models." Weimob COO Yin Shiming believes open source and openness can quickly deliver AI capabilities into users' hands; what users really need is "plug-and-play" AI.
For the majority of users who are still testing, experimenting with, and getting a feel for large AI models, the threshold of open-source models is clearly lower, and the start-up cost is almost zero.
Many users have used open-source models from the start and will keep using them, and the deployment and training problems described above are spawning a service industry chain around open-source models.
OpenCSG, which Chen Ran founded amid this large-model wave, runs a service business around open-source large models.
OpenCSG's large-model services mainly target enterprises' training and adoption of open-source models, covering everything from model selection through hybrid distributed computing power, training on business data, and back-end application development.
"**In my view, large models are like SaaS: the upstream and downstream of the industry will gradually fill out, and customers will not focus only on model capability.**" Chen Ran believes what customers ultimately want is not the most capable model, but a better, easier, simpler way to use large AI models to serve their business.
The open-source ecosystem around AI
Across the AI industry chain, open source goes far beyond models: from R&D through deployment to application, almost every link is inseparable from open source.
**Each of AI's three elements (algorithms, computing power, and data) requires open-source support.**
At the algorithm level, open-source AI large models are a relatively late arrival. In early AI R&D, almost all AI models were built with machine-learning frameworks, the equivalent of AI toolboxes. Today's mainstream machine-learning frameworks, including TensorFlow, PyTorch, and PaddlePaddle, are all open source.
At the data level, the open Common Crawl dataset was an important data source in training the GPT models. Many institutions and data companies have now released open-source AI training datasets, including the Zhiyuan Research Institute's COIG-PC dataset and Haitian AAC's DOTS-MM-0526 multimodal dataset.
For dataset publishers, open source not only boosts influence and brand value; open datasets also gather constructive feedback from the open-source community, helping find and fix errors or inconsistencies in the data. This external review improves data quality while further enriching the publisher's product ecosystem.
"**Algorithm engineers often struggle with a lack of data in R&D; high-quality data can bring qualitative improvements to model evaluation.** China currently faces a scarcity of high-quality datasets, which is holding back the development of Chinese large-model technology," said Li Ke, COO of Haitian AAC, one of the training-data providers for the open-source model LLaMA 2.
**As for computing power, the biggest bottleneck in AI development, open-source chip architectures are also stimulating the industry.**
On August 4, Qualcomm announced a joint venture with four other semiconductor companies to accelerate the commercialization of chips based on the open-source RISC-V architecture. There are currently three mainstream chip architectures on the market: x86, used in Intel's CPUs; Arm, which Nvidia uses in its CPU products; and RISC-V, an open-source instruction-set architecture.
"RISC-V provides a programmable environment. Chip development teams can use it to do a great deal of pre- and post-processing work, and can add special accelerators or functional modules to meet user needs." Gang Zhijian, senior vice president of marketing and business development at SiFive, said the RISC-V ecosystem offers rich options for chip development, a great help to today's rapidly growing demand for AI chips.
Compared with RISC-V, the Arm and x86 ecosystems are relatively closed. **In the Arm ecosystem, users can only choose among the limited options Arm provides, whereas the RISC-V ecosystem has many participating companies and will offer more product types and choices.**
The open-source architecture is also pushing the chip industry to compete faster. Gang Zhijian said: "**As a service provider in the open-source chip architecture ecosystem, we will compete with other companies. But whether we win or they do, the competition will ultimately promote the prosperity and progress of the RISC-V ecosystem.**"
Although the RISC-V instruction-set architecture is free and open source, the core IP that chip designers build on top of it through secondary development carries independent intellectual-property rights and can be licensed out for a fee. According to the RISC-V International Foundation, membership grew more than 26% year-on-year in 2022, to over 3,180 member organizations across 70 countries and regions, including Qualcomm, Intel, Google, Alibaba, Huawei, UNISOC, and many other leading chip companies.
Open source is RISC-V's advantage, but it also creates problems. RISC-V has only some 40-plus base instructions, plus dozens of basic module-extension instructions, and any company or developer can use RISC-V free of charge to create chips with independent intellectual property.
However, its open-source, highly customizable, modular nature also makes the RISC-V ecosystem more fragmented and complex.
"Once each chip company extends RISC-V's instruction set, the result is in effect a new architecture. It is still called RISC-V, but the versions from different companies are incompatible, and the open-source ecosystem is effectively fragmented." Lu Tao, president and general manager of Greater China at Weiwei Technology, believes that an open-source chip architecture and software ecosystem are very important, but striking a balance among openness, customization, and fragmentation is very difficult and tests the wisdom and ability of any R&D team.
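The fragmentation Lu Tao describes can be pictured with a toy model of instruction sets: a shared base plus incompatible vendor extensions. Everything below (vendor names, instruction mnemonics) is invented for illustration:

```python
# Toy model of ISA fragmentation: a common base instruction set plus
# vendor-specific extensions. All names here are illustrative, not real.
BASE_ISA = {"add", "sub", "lw", "sw", "beq"}      # stand-in for a base ISA
vendor_a = BASE_ISA | {"a.dotprod", "a.relu"}     # vendor A's AI extensions
vendor_b = BASE_ISA | {"b.matmul", "b.sigmoid"}   # vendor B's AI extensions

def runs_on(program: set, chip: set) -> bool:
    """A binary runs only if every instruction it uses exists on the chip."""
    return program <= chip

portable = {"add", "lw", "beq"}        # uses base instructions only
tuned_for_a = {"add", "a.dotprod"}     # depends on vendor A's extension

print(runs_on(portable, vendor_a), runs_on(portable, vendor_b))        # True True
print(runs_on(tuned_for_a, vendor_a), runs_on(tuned_for_a, vendor_b))  # True False
```

The last line is the fragmentation problem in miniature: software tuned for one vendor's extensions silently stops being "RISC-V software" on another vendor's chip.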
Moreover, the Arm architecture has already produced GPUs, IPUs, and other chips suited to AI training and inference, and its technical ecosystem is more complete and mature. RISC-V was originally designed for CPUs; open as it is, its use in AI chip design is still at the exploratory stage.
According to research firm Counterpoint Research, cumulative shipments of RISC-V processors will exceed 80 billion by 2025, a compound annual growth rate of 114.9%. By then, RISC-V is expected to hold 14% of the global CPU market, 28% of the IoT market, 12% of the industrial market, and 10% of the automotive market.
Qualcomm implemented RISC-V microcontrollers in its Snapdragon 865 SoC as early as 2019 and has shipped more than 650 million RISC-V chips to date. At the AI Hardware Summit in September 2022, Professor Krste Asanovic, a co-inventor of RISC-V, revealed that Google had begun using the RISC-V-based SiFive Intelligence X280 in TPU chips supporting its machine-learning framework TensorFlow. **Before that, Google had been developing its own TPU chip architecture for more than 10 years.**
Although developing RISC-V chips from scratch is difficult, RISC-V's open-source nature has given Chinese chips, which are likewise starting from scratch, a chance to survive amid blockade and monopoly. "From my perspective, China's chip companies are the fastest growing in the world; they are more aggressive and more willing to face challenges," Gang Zhijian said, adding that the Chinese market is key to stimulating the chip industry's development. China's chip market is huge; its demand for automotive chip computing power, for example, already far exceeds that of the European and American markets. **As Chinese enterprises' demand for AI computing power keeps rising, China's AI chip industry is bound to see more opportunities.**
Conclusion
Beyond commercial considerations, **open source also helps a model's publisher optimize it.**
"ChatGPT is really a victory of engineering." The success of today's large language models rests on repeated training and tuning. If, once the base model is in place, it is released to the open-source community and more developers join in optimizing it, that is undoubtedly a great help to the progress of large AI models.
Moreover, "open-source large models avoid reinventing the wheel." Lin Yonghua, vice president and chief engineer of the Beijing Zhiyuan Artificial Intelligence Research Institute, said in an interview during the 2023 Zhiyuan Conference that if everyone develops general-purpose large models from scratch, burning enormous computing power, data, and electricity, it amounts to wholesale reinvention of the wheel and a poor use of society's resources.
For a non-profit like the Zhiyuan Research Institute, the choice between open and closed source may involve little commercial calculation. But for commercial AI companies, whether Microsoft, Google, Meta, and OpenAI abroad or Zhipu AI and Baichuan Intelligent at home, no large AI model is built purely "for research."
Although OpenAI's products hold an absolute technical lead, the ChatGPT ecosystem, built around plug-ins, remains weak as an ecosystem. In the open-versus-closed battle over AI, we may yet see a pattern different from that of mobile operating systems.