On July 5, 2024, Li Auto announced at its 2024 Intelligent Driving Summer Conference that it will push its "drive nationwide" map-free NOA to all AD Max users in July, and will also launch fully automatic AES (Automatic Emergency Steering) and omni-directional low-speed AEB (Automatic Emergency Braking) in the same month.
At the same time, Li Auto released a new autonomous driving technology architecture built on an end-to-end model, a VLM (vision-language model), and a world model, and launched an Early Bird program for the new architecture.
On the product side, map-free NOA no longer relies on high-definition maps or prior information; it can be used anywhere with navigation coverage nationwide, and spatio-temporal joint planning delivers a smoother detour experience.
Map-free NOA also offers ultra-long-range lane selection, so vehicles pass smoothly even through complex intersections.
It is likewise built around users' psychological safety boundaries, with decimeter-level micro-maneuvering that makes the intelligent driving experience feel intuitive and reassuring.
In addition, the upcoming AES function can trigger fully automatically, without requiring any steering torque from the driver, to avert high-risk accidents.
Omni-directional low-speed AEB further widens the coverage of active-safety risk scenarios, effectively reducing the frequent scraping accidents of low-speed maneuvering.
On the technology side, the new architecture consists of an end-to-end model, a VLM vision-language model, and a world model.
The end-to-end model handles routine driving behavior: a single model spans the pipeline from sensor input to trajectory output, making information transfer, inference, and model iteration more efficient and driving behavior more human-like.
The VLM vision-language model has strong logical reasoning ability; it can understand complex road conditions, navigation maps, and traffic rules, and cope with difficult, previously unseen scenarios.
Meanwhile, the autonomous driving system learns and is tested in a virtual environment built on the world model.
Because the world model combines reconstruction and generation, the test scenarios it constructs both obey real-world laws and generalize well.
Fan Haoyu, Senior Vice President of Product at Li Auto, said: "Li Auto has always insisted on polishing the product experience together with its users. We pushed the release to the first thousand trial users in May this year and expanded the trial group to more than 10,000 users in June.
Across the country, they have accumulated more than one million kilometers of map-free NOA mileage.
Once map-free NOA is fully rolled out, 240,000 AD Max owners will be using China's leading intelligent driving product, a major upgrade delivered with sincerity."
Lang Xianpeng, Vice President of Intelligent Driving R&D at Li Auto, said: "From launching full-stack in-house development in 2021 to releasing the new autonomous driving technology architecture today, Li Auto's autonomous driving R&D has never stopped exploring.
By combining the end-to-end model with the VLM vision-language model, we have delivered the industry's first dual-system deployment on the vehicle side, and we are the first to successfully deploy a VLM vision-language model on a vehicle-side chip.
This new, industry-leading architecture is a milestone for the autonomous driving field."
Map-free NOA upgrades four capabilities, for efficient use of roads nationwide.
Map-free NOA, launching in July, brings four major capability upgrades that enhance the user experience across the board.
First, thanks to across-the-board improvements in perception, understanding, and road-structure construction, map-free NOA is freed from dependence on prior information.
Users can engage NOA in any city with navigation coverage nationwide, and can even activate it on narrow alleys and country roads.
Second, efficient spatio-temporal joint planning lets the vehicle avoid and detour around road obstacles more smoothly.
Spatio-temporal joint planning plans laterally and longitudinally at the same time: by continuously predicting the spatial interaction between the ego vehicle and other vehicles, it plans all drivable trajectories within a future time window.
Drawing on learning from high-quality samples, the vehicle quickly selects the optimal trajectory and executes the detour decisively and safely.
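As a rough sketch of this idea (not Li Auto's implementation): the snippet below enumerates candidate trajectories that vary lateral offset and speed jointly, scores each against constant-velocity predictions of the other agents over a shared time window, and picks the best safe one. Every function, weight, and threshold is an illustrative assumption.

```python
import numpy as np

# Spatio-temporal joint planning, minimally: lateral and longitudinal
# choices are planned together over one future time window.
HORIZON_S = 5.0      # planning window (seconds)
DT = 0.5             # time step
STEPS = int(HORIZON_S / DT)

def predict_agent(agent_xy, agent_v_xy):
    """Constant-velocity prediction of another agent (stand-in for a learned predictor)."""
    return [agent_xy + agent_v_xy * (k * DT) for k in range(STEPS)]

def candidate_trajectories(ego_xy, speed):
    """Candidates that vary lateral offset and speed jointly."""
    cands = []
    for lat in (-3.0, 0.0, 3.0):              # lane-relative lateral offsets (m)
        for v in (speed * 0.6, speed, speed * 1.1):
            traj = [ego_xy + np.array([v * k * DT, lat * min(1.0, k * DT)])
                    for k in range(STEPS)]
            cands.append((traj, lat, v))
    return cands

def score(traj, lat, v, agent_preds):
    # Collision margin: distance to every agent at the same time step.
    margin = min(np.linalg.norm(p - q)
                 for preds in agent_preds
                 for p, q in zip(traj, preds))
    if margin < 2.0:                           # hard safety threshold (m)
        return -np.inf
    progress = np.linalg.norm(traj[-1] - traj[0])
    comfort = -abs(lat)                        # prefer staying near lane center
    return 1.0 * margin + 0.5 * progress + 2.0 * comfort

ego = np.array([0.0, 0.0])
agents = [(np.array([15.0, 0.0]), np.array([0.0, 0.0]))]  # stopped car ahead
agent_preds = [predict_agent(p, v) for p, v in agents]
best = max(candidate_trajectories(ego, speed=8.0),
           key=lambda c: score(*c, agent_preds))
print("chosen lateral offset:", best[1], "m, speed:", best[2], "m/s")
```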
At complex urban intersections, map-free NOA's lane-selection ability has also improved markedly.
Map-free NOA uses a BEV visual model fused with a navigation-matching algorithm to perceive changing lane edges, road arrow markings, and intersection features in real time. By fully fusing the perceived lane structure with navigation features, it effectively solves the difficulty of structuring complex intersections, gains ultra-long-range lane-selection ability, and passes through intersections more steadily.
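A toy illustration of navigation matching under these assumptions: perceived lanes carry the arrow markings the BEV model reads, and the planner prefers the lane whose arrow matches the upcoming navigation maneuver. The data fields and tie-breaking rule are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PerceivedLane:
    lane_id: int
    arrow: str                # painted arrow seen by the BEV model: "left", "straight", "right"
    lateral_offset_m: float   # signed distance from ego to this lane's center

def pick_target_lane(lanes, nav_maneuver: str) -> PerceivedLane:
    """Prefer lanes whose painted arrow matches the navigation maneuver;
    break ties by choosing the lane closest to the ego vehicle."""
    matching = [l for l in lanes if l.arrow == nav_maneuver]
    pool = matching or lanes             # fall back to any lane if none match
    return min(pool, key=lambda l: abs(l.lateral_offset_m))

lanes = [PerceivedLane(0, "left", -3.5),
         PerceivedLane(1, "straight", 0.0),
         PerceivedLane(2, "right", 3.5)]
print(pick_target_lane(lanes, "left").lane_id)   # -> 0
```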
At the same time, map-free NOA attends to users' psychological safety boundaries, using decimeter-level micro-maneuvering to deliver an intuitive, reassuring driving experience.
Through an occupancy network with early fusion of lidar and camera data, the vehicle can identify irregular obstacles over a wider range and with higher perception accuracy, and thus predict the behavior of other traffic participants earlier and more precisely.
As a result, the vehicle keeps a sensible distance from other traffic participants and times its acceleration and deceleration better, which noticeably improves the user's sense of safety while driving.
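The following is a minimal sketch of fusing the two modalities into an occupancy grid, assuming a simple voting-plus-weighting scheme; real occupancy networks are learned end to end, and every grid size, weight, and threshold here is illustrative.

```python
import numpy as np

GRID = (100, 100, 10)       # 0.5 m voxels: 50 m x 50 m x 5 m around ego
VOXEL_M = 0.5

def lidar_to_grid(points_xyz):
    """Count lidar returns per voxel (ego at grid center, ground at z=0)."""
    grid = np.zeros(GRID, dtype=np.float32)
    idx = (points_xyz / VOXEL_M + np.array([50, 50, 0])).astype(int)
    ok = np.all((idx >= 0) & (idx < np.array(GRID)), axis=1)
    for i, j, k in idx[ok]:
        grid[i, j, k] += 1.0
    return np.clip(grid / 5.0, 0.0, 1.0)      # saturate at 5 returns

def fuse(lidar_grid, camera_prob_grid, w_lidar=0.6):
    """Convex combination of the two modalities, then a hard threshold."""
    prob = w_lidar * lidar_grid + (1.0 - w_lidar) * camera_prob_grid
    return prob > 0.5                          # boolean occupancy

points = np.array([[5.0, 0.0, 0.5]] * 6)       # six returns off one obstacle
cam = np.zeros(GRID, dtype=np.float32)
cam[60, 50, 1] = 0.9                            # camera agrees: something at (5, 0)
occ = fuse(lidar_to_grid(points), cam)
print("occupied voxels:", int(occ.sum()))
```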
Active safety capabilities advance, and scenario coverage expands.
In the field of active safety, Li Auto has built a comprehensive database of safety-risk scenarios and keeps expanding scenario coverage according to how frequently each scenario occurs and how dangerous it is.
The fully automatic AES and omni-directional low-speed AEB functions will be released to users in July.
To handle the physical-limit scenarios in which AEB alone cannot prevent an accident, Li Auto is launching a fully automatically triggered AES (automatic emergency steering) function.
When the vehicle is traveling at high speed, the active-safety system has very little reaction time.
In some cases, even with AEB triggered and the vehicle braking at full force, it cannot stop in time.
At that moment AES triggers promptly and steers automatically, with no steering input from the driver, to evade the target ahead and avoid the accident in these extreme situations.
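The trigger condition can be sketched as a comparison between the full-braking stopping distance and the remaining gap; the deceleration limit and thresholds below are illustrative assumptions, not Li Auto's calibration.

```python
MAX_DECEL = 8.0   # m/s^2, near the friction limit of full braking

def braking_distance(speed_mps: float) -> float:
    return speed_mps ** 2 / (2.0 * MAX_DECEL)

def active_safety_action(speed_mps: float, gap_m: float,
                         left_clear: bool, right_clear: bool) -> str:
    if braking_distance(speed_mps) <= gap_m:
        return "AEB: full braking is sufficient"
    if left_clear or right_clear:
        side = "left" if left_clear else "right"
        return f"AES: emergency steer {side}"   # braking alone cannot stop in time
    return "AEB: brake to minimize impact speed"

# 110 km/h with a 40 m gap needs ~58 m to stop, so steer if a lane is clear.
print(active_safety_action(110 / 3.6, 40.0, left_clear=True, right_clear=False))
```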
Omni-directional low-speed AEB provides 360-degree active-safety protection for parking and low-speed driving scenarios.
In complex underground parking garages, the pillars, pedestrians, and other vehicles around the car all raise the risk of scraping.
Omni-directional low-speed AEB effectively identifies forward, rearward, and lateral collision risks and brakes promptly, making everyday driving more reassuring.
Innovation in autonomous driving technology: the dual system is more intelligent.
Li Auto's new autonomous driving technology architecture is inspired by Nobel laureate Daniel Kahneman's theory of fast and slow systems: it simulates the human thinking and decision-making process within autonomous driving to form a more intelligent, more human-like driving solution.
The fast system, System 1, excels at simple tasks; drawing on experience and habit, it handles roughly 95% of the routine scenarios encountered while driving.
The slow system, System 2, is the capacity for logical reasoning, complex analysis, and computation that humans form through deeper understanding and learning; it resolves the complex or even unknown traffic scenarios that account for about 5% of daily driving.
Working together, System 1 ensures high efficiency in the majority of scenarios while System 2 provides a high capability ceiling in the rest, and the pair forms the basis of human cognition, understanding of the world, and decision-making.
Based on this fast-and-slow theory, Li Auto has shaped the prototype of its autonomous driving algorithm architecture.
System 1 is implemented by the end-to-end model and provides efficient, rapid response.
The end-to-end model receives sensor input and directly outputs a trajectory to control the vehicle.
System 2 is realized by the VLM vision-language model: after receiving sensor input, it reasons logically and outputs decision information to System 1.
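A minimal sketch of how such a dual-system split might be wired on the vehicle side, assuming System 1 runs every control cycle while the slower VLM runs at a lower rate and feeds hints forward; all class and field names are hypothetical stand-ins for the deployed models.

```python
import time

class System1:  # end-to-end model: sensors -> trajectory, high frequency
    def plan(self, sensors, hint=None):
        return {"trajectory": "...", "hint_applied": hint}

class System2:  # VLM: sensors -> decision hint, low frequency, slower
    def reason(self, sensors):
        return "bus lane ahead, restricted 7:00-9:00; keep out"

s1, s2 = System1(), System2()
hint, last_vlm = None, 0.0

for cycle in range(5):                 # stand-in for the control loop
    sensors = {"camera": ..., "lidar": ...}
    now = time.monotonic()
    if now - last_vlm > 1.0:           # VLM at ~1 Hz vs. S1 every cycle
        hint, last_vlm = s2.reason(sensors), now
    cmd = s1.plan(sensors, hint)       # S1 consumes S2's latest decision
```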
The driving capabilities of the dual system are also trained and validated in the cloud using the world model.
A highly efficient end-to-end model.
The end-to-end model's input consists mainly of camera and lidar data; a CNN backbone network extracts and fuses the multi-sensor features and projects them into BEV space.
To improve the model's representational power, Li Auto also designed a memory module with memory along both the temporal and spatial dimensions.
The model's input additionally includes vehicle state and navigation information; after encoding by the Transformer model, these are decoded together with the BEV features into dynamic obstacles, road structure, and general obstacles, and the driving trajectory is planned.
All tasks are output by one integrated model with no rule-based intervention in between, giving the end-to-end model clear advantages in information transfer, inference computation, and model iteration.
In real-world driving, the end-to-end model shows stronger understanding of general obstacles, beyond-line-of-sight navigation, better comprehension of road structure, and more human-like path planning.
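A skeleton of the described layout, in PyTorch-style modules: stand-in CNN backbones fuse camera and lidar features in BEV, a recurrent cell stands in for the memory module, and a Transformer decoder turns BEV features plus ego/navigation state into multi-task outputs. Every dimension and module choice is an illustrative assumption, not the production network.

```python
import torch
import torch.nn as nn

class EndToEndSketch(nn.Module):
    def __init__(self, d=256, bev=32):
        super().__init__()
        self.cam_backbone = nn.Conv2d(3, d, 3, padding=1)     # stand-in CNN
        self.lidar_backbone = nn.Conv2d(1, d, 3, padding=1)
        self.memory = nn.GRUCell(d, d)                        # temporal memory
        self.ego_nav_enc = nn.Linear(8, d)                    # ego state + navigation
        self.decoder = nn.TransformerDecoderLayer(d, nhead=8, batch_first=True)
        self.traj_head = nn.Linear(d, 10 * 2)                 # 10 (x, y) waypoints
        self.obstacle_head = nn.Linear(d, 16)                 # per-query obstacle logits

    def forward(self, cam_bev, lidar_bev, ego_nav, mem):
        fused = self.cam_backbone(cam_bev) + self.lidar_backbone(lidar_bev)
        tokens = fused.flatten(2).transpose(1, 2)             # (B, HW, d) BEV tokens
        mem = self.memory(tokens.mean(dim=1), mem)            # update scene memory
        query = (self.ego_nav_enc(ego_nav) + mem).unsqueeze(1)
        feat = self.decoder(query, tokens)                    # attend over BEV
        return self.traj_head(feat), self.obstacle_head(feat), mem

model = EndToEndSketch()
mem = torch.zeros(1, 256)
traj, obstacles, mem = model(torch.rand(1, 3, 32, 32),   # camera features in BEV
                             torch.rand(1, 1, 32, 32),   # lidar occupancy in BEV
                             torch.rand(1, 8), mem)
print(traj.shape)   # torch.Size([1, 1, 20]) -> 10 waypoints x (x, y)
```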
A high-ceiling VLM vision-language model.
The VLM's algorithm architecture is a unified Transformer model. The prompt text is encoded with a tokenizer, the images from the forward-facing camera and the navigation map are encoded as well, and the modalities are then aligned through an image-text alignment module. Unified autoregressive inference finally outputs an understanding of the environment, a driving decision, and a driving trajectory, which are passed to System 1 to assist in controlling the vehicle.
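A schematic of that pipeline at toy dimensions (the production model is far larger): tokenize the prompt, patchify and encode the camera and navigation-map images, align the modalities into one token sequence, and decode autoregressively. Modules and sizes are illustrative assumptions, not the deployed architecture.

```python
import torch
import torch.nn as nn

VOCAB, D = 1000, 128

embed = nn.Embedding(VOCAB, D)                       # prompt tokens -> vectors
img_enc = nn.Sequential(nn.Conv2d(3, D, 16, 16),     # patchify camera / nav-map images
                        nn.Flatten(2))
align = nn.Linear(D, D)                              # image-text alignment module
decoder = nn.TransformerEncoderLayer(D, nhead=8, batch_first=True)
lm_head = nn.Linear(D, VOCAB)

prompt_ids = torch.randint(0, VOCAB, (1, 12))        # tokenized prompt
camera = torch.rand(1, 3, 64, 64)
nav_map = torch.rand(1, 3, 64, 64)

text = embed(prompt_ids)                                      # (1, 12, D)
vision = align(torch.cat([img_enc(camera), img_enc(nav_map)],
                         dim=2).transpose(1, 2))              # (1, 32, D)
seq = torch.cat([vision, text], dim=1)                        # unified sequence

# Greedy autoregressive decoding of a few output tokens (decision/trajectory
# tokens in the real system; plain vocabulary indices here).
for _ in range(4):
    logits = lm_head(decoder(seq))[:, -1]
    nxt = logits.argmax(-1, keepdim=True)
    seq = torch.cat([seq, embed(nxt)], dim=1)
```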
Li Auto's VLM vision-language model has 2.2 billion parameters and a strong ability to understand the complex traffic environments of the physical world, even in unknown scenarios it encounters for the first time.
The VLM model can recognize road-surface conditions, lighting, and other environmental information, prompting System 1 to moderate its speed and keep the drive safe and comfortable.
The VLM model also understands navigation maps better, cooperating with the infotainment system to correct the route and prevent wrong turns while driving.
It can likewise interpret complex traffic rules such as bus lanes, tidal (reversible) lanes, and time-based traffic restrictions, and make reasonable decisions while driving.
A world model that combines reconstruction and generation.
Li Auto's world model combines the reconstruction and generation technical paths: real-world data is reconstructed with 3DGS (3D Gaussian Splatting) technology, and a generative model supplements it with novel views.
During scene reconstruction, dynamic and static elements are separated: the static environment is reconstructed, while dynamic objects are both reconstructed and rendered from new viewpoints.
After the scene is re-rendered, it forms a 3D physical world in which dynamic assets can be freely edited and adjusted, enabling partial generalization of the scene.
Compared with reconstruction, the generative model generalizes more strongly: weather, lighting, traffic flow, and other conditions can all be customized to produce new scenes that obey real-world laws, which are used to evaluate how well the autonomous driving system adapts to varied conditions.
Scenes built by combining reconstruction and generation provide a better virtual environment for the autonomous driving system to learn and be tested in, giving the system an efficient closed-loop iteration capability while ensuring its safety and reliability.
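A structural sketch of the two paths, with all classes and methods as illustrative placeholders: reconstruct a logged scene, edit its dynamic assets, then ask the generative path for variants under new controllable conditions.

```python
from dataclasses import dataclass, field, replace

@dataclass
class Scene:
    static_bg: str                                # reconstructed static environment
    agents: list = field(default_factory=list)    # editable dynamic assets
    weather: str = "clear"
    lighting: str = "day"

def reconstruct(drive_log) -> Scene:
    """3DGS-style reconstruction: split static/dynamic, rebuild both."""
    return Scene(static_bg=f"gaussians[{drive_log}]",
                 agents=["car_ahead", "cyclist"])

def generate_variant(scene: Scene, **conditions) -> Scene:
    """Generative path: same layout, new controllable conditions."""
    return replace(scene, **conditions)

base = reconstruct("log_0705")
base.agents.append("pedestrian_crossing")         # edit a dynamic asset
tests = [generate_variant(base, weather=w, lighting=l)
         for w in ("clear", "rain", "snow")
         for l in ("day", "night")]
print(len(tests), "closed-loop test scenes from one drive log")
```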