T 2412/22 (Adaptive driving models/STRADVISION) 27-11-2024
METHOD AND DEVICE FOR PROVIDING PERSONALIZED AND CALIBRATED ADAPTIVE DEEP LEARNING MODEL FOR THE USER OF AN AUTONOMOUS VEHICLE
I. The appeal lies from the decision of the Examining Division to refuse the application.
II. The Examining Division found the main request and the second and third auxiliary requests underlying its decision to lack inventive step over document
D1: US2018/0053102 A1.
It did not admit the other requests, namely the first and fourth auxiliary requests.
III. The Appellant requests that the decision of the Examining Division be set aside and that a patent be granted on the basis of a main request or of one of four auxiliary requests, as filed with the statement of grounds of appeal, and numbered 1', 2, 3 and 4'. The main, the second auxiliary, and the third auxiliary requests are identical with the corresponding requests underlying the decision. The first and fourth auxiliary requests amend the respective first and fourth auxiliary requests underlying the decision.
IV. In a communication accompanying a summons to oral proceedings, the Board informed the Appellant of its preliminary opinion that none of the requests was allowable, for lack of inventive step.
V. The Appellant replied in writing without changing its requests. The present decision was taken in oral proceedings before the Board.
VI. Claim 1 of the main request defines:
A method for providing a deep learning model, to thereby support at least one specific autonomous vehicle to perform an autonomous driving according to surrounding circumstances, comprising steps of:
(a) a managing device which interworks with each of autonomous vehicles driven by each of legacy deep learning models pre-trained, if a video data transmitted from the specific autonomous vehicle among the autonomous vehicles is acquired through a video storage system, instructing a fine-tuning system to acquire a deep learning model corresponding to a specific legacy deep learning model of the specific autonomous vehicle to be updated by using the video data from a deep learning model library storing one or more deep learning models;
(b) the managing device inputting the video data and its corresponding labeled data to the fine-tuning system as training data, to thereby update the deep learning model by re-training the deep learning model with the training data including the video data and the labeled data; and
(c) the managing device instructing an automatic updating system to transmit the updated deep learning model to the specific autonomous vehicle, to thereby support the specific autonomous vehicle to perform the autonomous driving by using the updated deep learning model other than the specific legacy deep learning model;
wherein the managing device acquires the labeled data by inputting the video data to at least one of an auto-labeling system and a manual-labeling system, and
wherein the auto-labeling system applies an auto-labeling operation, using a certain deep learning model for labeling acquired from the deep learning model library, to the video data, to thereby generate at least part of the labeled data, and the manual-labeling system distributes the video data to each of labelers by using a distribution algorithm and acquires outputs of the labelers corresponding to the video data, to thereby generate at least part of the labeled data,
wherein, at the step of (a), the managing device instructs the deep learning model library to find at least one among the deep learning models whose relationship score in relation to the video data is larger than a threshold, and to deliver it to the fine-tuning system as the deep learning model, and wherein relationship scores are calculated by using at least part of video subject vehicle information, video subject time information, video subject location information and video subject driver information of the deep learning models.
VII. Claim 1 of the first auxiliary request differs from that of the main request by the following feature added at the end of the claim:
wherein the deep learning models have the information as their tagged data.
VIII. In claim 1 of the second auxiliary request, the last part of claim 1 of the main request ("wherein, at the step...") is replaced by the following:
wherein the managing device instructs
(i) a label-validating system to perform a cross-validation by comparing each of parts of the labeled data generated by each of the auto-labeling system and the manual-labeling system, finding one or parts thereof whose similarity scores are smaller than a threshold, to thereby generate feedback information,
(ii) the auto-labeling system and the manual-labeling system to determine whether to adjust said parts of the labeled data by using the feedback information and adjust said parts of the labeled data when the feedback information indicates a necessity of an adjustment, wherein the auto-labeled data or the manual-labeled data suspected of having labeling errors are relabeled
(iii) and the label-validating system to deliver final labeled data which has been validated by the label-validating system to the fine-tuning system, wherein the final labeled data is generated by integrating the relabeled auto-labeled data and the relabeled manual-labeled data.
IX. Claim 1 of the third auxiliary request reinserts, at the end of claim 1 of the second auxiliary request, the features previously deleted from the main request ("wherein, at the step...").
X. Claim 1 of the fourth auxiliary request adds to claim 1 of the third auxiliary request the same feature which was added in the first auxiliary request.
The application
1. The application relates to the provision of adaptive deep learning models calibrated and personalized for ("users" of) autonomous vehicles.
1.1 According to the application, known autonomous vehicles use "legacy" deep learning models trained by using data collected per country or region, and this is not satisfactory for drivers with different "tendencies", which the Board understands to refer to driving behaviours. The application therefore proposes that the learning models be customised (see page 1).
1.2 In order to do this, the system maintains a collection of legacy models. For any individual vehicle, a suitable legacy model is selected for fine-tuning. The legacy model is selected to have been trained for "video" conditions (i.e. vehicle type, place, time, or driver) similar to those applying to the vehicle of interest. The tuning is realised by further training the legacy model with data collected from the vehicle of the specific user (see page 2). This data is labelled by a combination of automatic and manual labelling (see page 3).
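Purely by way of illustration, the functional description above can be rendered as the following sketch. All names, data structures and the update step are hypothetical: the application describes the system only at the level of the paraphrase above.

```python
# Illustrative sketch only: hypothetical names and interfaces for the
# pipeline described in the application (library of legacy models,
# selection for the specific vehicle, fine-tuning, deployment).

from dataclasses import dataclass, field

@dataclass
class LegacyModel:
    meta: dict                        # video subject vehicle, place, time, driver
    weights: list = field(default_factory=list)

    def retrain(self, video_data, labels):
        # stand-in for fine-tuning with the user's labelled video data
        self.weights.append((video_data, labels))
        return self

def relationship_score(model_meta: dict, video_meta: dict) -> float:
    """Fraction of the four 'video' attributes on which model and video agree."""
    keys = ("vehicle", "place", "time", "driver")
    return sum(model_meta.get(k) == video_meta.get(k) for k in keys) / len(keys)

def provide_updated_model(library, video_meta, video_data, labels, threshold=0.5):
    # (a) select a legacy model trained under conditions similar to the video
    candidates = [m for m in library
                  if relationship_score(m.meta, video_meta) > threshold]
    if not candidates:
        return None
    # (b) fine-tune ("re-train") the selected model with the labelled data
    updated = candidates[0].retrain(video_data, labels)
    # (c) the updated model would then be transmitted back to the vehicle
    return updated
```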
Document D1
2. D1 relates to adapting previously trained driver prediction models using local data (paragraph 2). Local data may be user specific, location specific or "moving platform" specific (paragraph 31). A "stock machine learning-based driver action prediction model" is adapted during operation of a vehicle using a model adaptation engine (see e.g. paragraph 13).
2.1 Driver actions are labelled by automatic classification or by "hand labeling coupled to a classifier", on the basis of vehicle sensor data (paragraph 78). Based on these labels and what is called "prediction data", e.g. data captured from the environment, the chosen stock classifier is adapted (retrained) to produce an individualised driver prediction model (see e.g. paragraphs 77 and 103).
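Again purely for illustration (D1 contains no code; all names and the update rule here are hypothetical), the onboard scheme just summarised might be sketched as follows:

```python
# Hypothetical sketch of the scheme read in D1: a pre-trained "stock"
# driver action prediction model is adapted onboard, during operation,
# from locally labelled driver actions.

def adapt_onboard(stock_model, sensor_stream, label_action, learn_step):
    """Incrementally individualise the stock model while the vehicle runs."""
    for prediction_data in sensor_stream:             # captured environment data
        action_label = label_action(prediction_data)  # automatic or hand labelling
        if action_label is not None:
            learn_step(stock_model, prediction_data, action_label)  # adapt (retrain)
    return stock_model                                # individualised model
```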
Main request
Differences over D1
3. In its communication the Board identified the following set of differences between claim 1 and D1:
(a) the fine-tuning taking place on a server ("managing device") rather than on the vehicle, and the subsequent transmission of the customised model to the vehicle
(b) the existence of a library of legacy models for specific vehicles
(c) the selection of one model for updating based on a relationship score determined using video data information
(d) a certain data labelling scheme, as recited in the penultimate claim paragraph.
4. In its reply to the communication of the Board, and in the oral proceedings, the Appellant discussed the conceptual differences between D1 and claim 1 but did not contest the set of differences between claim 1 and D1 as identified by the Board.
Obviousness
5. In its communication the Board indicated to the Appellant that it tended to agree with the Examining Division that all differences were obvious starting from D1.
5.1 Difference (a) was an obvious alternative for the skilled person who was aware of the trade-off between data transmission requirements and available computational resources.
5.2 Regarding differences (b) and (c), the Board noted that it appeared obvious that the skilled person would create different models for different vehicles, which suggested the provision of a library and of a corresponding selection step as claimed.
5.3 Regarding difference (d), the labelling method was so unspecific that a technical effect could not be acknowledged. Furthermore, D1 mentioned both automatic and manual labelling, so that the combination of both as claimed was obvious.
6. In the oral proceedings the Appellant did not contest the Board's assessment of difference (d) but focused on differences (a) to (c). The Appellant argued that the Examining Division's (and the Board's) analysis was ex post facto. Without knowledge of the invention, the skilled person had no reason to modify D1 so as to arrive at the claimed invention. D1 and the claimed invention were conceptually different and pursued different objectives.
6.1 The invention related to continuous learning of a deep learning model for a specific autonomous vehicle. The model was retrained with specific video data for specific circumstances and stored in a library containing the various models. The storage of models retrained for various circumstances allowed for efficient fine-tuning through fast re-training with minimal data. The selection step based on video data information ensured that the proper model was selected and updated.
6.2 In contrast, the focus of D1 was to develop a real-time solution suitable for onboard use. The Appellant referred inter alia to paragraphs 6, 16, 31, and to claim 17, all of which mentioned real-time adaptation. The solution of D1 was one in which a (single) generic stock model was adapted to a driver on the vehicle itself, during the operation of the vehicle.
6.2.1 For this reason, there was no need for a library in D1. The Examining Division merely stated that the stock model had to be stored somewhere and concluded that this already disclosed a library. The Appellant disputed that storing a stock model implied a library. But even if that were the case, there was no need for a selection step from a "library" with a single entry, especially one based on a relationship score taking into account video data information as claimed.
6.2.2 The skilled person also had no reason to perform the model adaptation on a server. There was enough computing power on a vehicle to perform re-training, and the need to communicate with a server might compromise real-time adaptation. In fact, the real-time requirement of D1 taught away from a centralized solution. Sending video data, waiting for computation and receiving the adapted model caused time delays which did not allow real-time adaptation.
7. The Board remarks first that the Appellant's conceptual presentation of the invention (see 6 and 6.1 above) does not entirely correspond to the claimed invention, which is less detailed and therefore of broader scope.
7.1 In particular, the continuous learning aspect is not part of the claimed invention. The library is not defined to be dynamic in content, because the claim does not specify that the updated model is stored back in the library.
7.2 Also, the step of selecting a model from the library is very broadly formulated. It merely states that a relationship score needs to be larger than a threshold, and that the relationship score is calculated using at least part of the video data information, which includes vehicle, location, time, and driver information. A selection based e.g. only on the vehicle type (or location, or time, etc.) is within the claim scope.
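This breadth can be made concrete with a deliberately minimal, hypothetical example (not taken from the application): a score computed from the vehicle type alone already satisfies the claim wording.

```python
# Deliberately minimal, hypothetical selection step that still falls
# within the claim wording: the "relationship score" uses only the
# vehicle-type attribute, i.e. "at least part of" the video data
# information, and is compared against a threshold.

def relationship_score(model_meta: dict, video_meta: dict) -> float:
    return 1.0 if model_meta.get("vehicle") == video_meta.get("vehicle") else 0.0

def select(library: list, video_meta: dict, threshold: float = 0.5) -> list:
    """Every stored model scoring above the threshold qualifies."""
    return [m for m in library
            if relationship_score(m["meta"], video_meta) > threshold]
```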
7.3 The claim therefore covers a method for providing a deep learning model to an autonomous vehicle based on a static library of deep learning models for different vehicles, from which a managing device selects a model corresponding to the vehicle in question, retrains it using the vehicle video data, and transmits it to the vehicle.
8. The Appellant argued that a library of models was not needed in D1. In the Board's view, although the library may not be strictly necessary, it is something that the person skilled in the art would certainly consider.
8.1 Document D1, as already explained above, relates to predicting driver actions and proposes "adapting previously trained models to specific circumstances using local data" (paragraph 2). This local data may be inter alia "moving platform specific" (paragraph 31).
8.2 D1 is concerned with a variety of "moving platform" types, such as an automobile or a bus (paragraph 41). The person skilled in the art knows that these different types of moving platforms are driven differently, and that they possess different sensor and/or actuator configurations (see again paragraph 41).
8.3 Because of that, the person skilled in the art will consider using different "previously trained models" (in the words of paragraph 2) for each type of vehicle, i.e. a library of models as claimed.
8.4 If these are not available, the person skilled in the art will consider adapting a generic model to the different types of vehicles before adapting for specific drivers of these different types. As it would be inefficient to reproduce the same adaptation for every vehicle of the same type, the person skilled in the art would store the adapted models and make them available to be used for other vehicles of the same type.
8.5 A selection step is then necessary in order to provide the correct model for the specific vehicle considered.
9. Thereby the person skilled in the art implements a method for providing a deep learning model to an autonomous vehicle which uses a static library of deep learning models for different vehicles, from which a managing device selects a model corresponding to the vehicle in question.
10. Following D1, this model is provided to the vehicle and, as the Appellant argued, adapted to the driver onboard the vehicle.
10.1 However, for inventive step, the question is not what D1 discloses, but how the person skilled in the art would modify it, e.g. in order to improve it.
10.2 In general, the person skilled in the art would consider well-known alternatives. In the case in hand, this applies to adapting the model on a central device and sending the updated model to the vehicle. There are, in the Board's view, good reasons for doing this, in particular the fact that more computational resources may be - and generally are - available on the server, and that this way the on-board computer, with necessarily limited resources, is free to perform other tasks.
10.3 Admittedly, this requires data transmission, but this trade-off is known to the person skilled in the art, who would choose one of the two options depending on the circumstances.
11. The Appellant argued that D1 focused on real-time adaptation during the operation of the vehicle and that this taught away from a centralized solution, which did not allow for real-time adaptation.
12. The Board is not convinced by these arguments, for multiple reasons.
12.1 First, while D1 is concerned with real-time adaptation, in the Board's view this is not the only, or even the main, focus of the teaching of D1. D1 is primarily concerned with providing an "adaptable model that can benefit from both past data collection and adapt to a custom set of circumstances" (paragraph 5, see also the beginning of paragraph 31). This is discussed before real-time processing is mentioned. Real-time adaptation is desirable but not a necessary, indissociable part of D1's teaching.
12.2 Second, centralized computation may offer the same level of "real time" as adaptation on the device, because fast computation on a server can make up for data transmission delays.
12.3 Third, D1 does not precisely define what is meant by real time other than "during operation of the vehicle" (see e.g. claim 17). On the one hand, this means that an adaptation completed in time for the next start of the vehicle may be good enough. On the other hand, such a vague requirement is not one that the person skilled in the art would feel bound to follow. In other words, a vague real-time requirement does not teach away from considering the claimed alternative.
12.4 The Board is therefore convinced that the person skilled in the art would consider the alternative of performing the adaptation on a central server rather than onboard the vehicle.
13. Considering in particular points 5.3, 6, 7.3, 9 and 12.4 above, the Board concludes that the person skilled in the art would arrive in an obvious manner at subject matter falling within the scope of the claimed invention. Therefore, claim 1 of the main request lacks inventive step (Article 56 EPC).
Auxiliary requests
Admittance of auxiliary requests 1' and 4' (Article 12(4) RPBA)
14. These requests amend the corresponding requests before the Examining Division so as to remove wording objected to by the Examining Division as lacking clarity (see decision points 14 and 23.1). This amendment does not raise any other issues. The Board therefore admits these requests.
Inventive step
15. In the auxiliary requests the Appellant added, separately or in combination, two different sets of features.
16. In the first and fourth auxiliary requests it is defined that "the deep learning models have the [video data] information as their tagged data".
16.1 According to the Appellant this simplifies and speeds up the model selection step.
17. The Board remarks that once a library of models for various types of vehicles is defined (see point 9 above), the models need to be indexed by type, so that they can be differentiated and retrieved. This indexing implies a "tag" of some form as claimed.
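A hypothetical two-line illustration of this point (the model names and files are invented):

```python
# Hypothetical illustration: the dictionary key (vehicle type) acts as
# the "tagged data" by which models are differentiated and retrieved.
library = {"automobile": "car_model.bin", "bus": "bus_model.bin"}
selected = library["bus"]   # retrieval via the tag
```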
17.1 Hence this amendment cannot change the assessment as to inventive step.
18. Claim 1 of each of auxiliary requests 2, 3 and 4' (also) defines a cross-validation step between the manual and the automatic labelling.
18.1 The Examining Division was of the opinion that, due to the lack of technical details, no technical effect could be associated with this feature.
18.2 The Appellant argued in the statement of grounds of appeal that the new features provided the technical effect of improving the efficiency of the system by increasing labeling accuracy (see, e.g., the grounds of appeal, sections III.2 and III.3).
19. In its preliminary opinion the Board agreed with the Examining Division, because the new set of features defined neither how to determine a degree of difference between labels (see also the minutes of the oral proceedings before the Examining Division at the top of page 2), nor how to determine which labels are erroneous (the automatically or the manually provided ones), nor how to correct them.
19.1 The Appellant did not dispute that this information was missing from the claims, and even from the application as a whole, but argued that the person skilled in the art would know how to implement the claimed steps so as to obtain the desired technical effect.
20. However, if this were the case, it would also imply that the cross-validation would be obvious to the skilled person: given that D1 already specifies a combination of automatic and manual labelling (paragraph 78, see 2.1 above), the person skilled in the art, knowing how to improve accuracy by cross-validation between automatic and manual labelling, would use it without exercising any inventive skill.
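By way of illustration only, such a cross-validation might look as follows. This is a naive sketch under assumptions the claims leave entirely open, namely per-frame label sets and an overlap metric as the similarity score:

```python
# Naive hypothetical sketch of cross-validating auto- and manual labels.
# The claims leave the similarity metric, the attribution of errors and
# the correction step open; Jaccard overlap on per-frame label sets is
# assumed here purely for illustration.

def similarity(auto_labels: set, manual_labels: set) -> float:
    union = auto_labels | manual_labels
    return len(auto_labels & manual_labels) / len(union) if union else 1.0

def feedback(auto: dict, manual: dict, threshold: float) -> list:
    """Frame ids whose auto/manual labels diverge too much and are
    therefore flagged for relabelling (the 'feedback information')."""
    return [f for f in auto
            if similarity(auto[f], manual.get(f, set())) < threshold]
```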
21. The Appellant also submitted that to arrive at the invention starting from D1 a number of modifications were needed. There was no reason for the person skilled in the art to perform all of them. The added features, in particular in the fourth auxiliary request, further increased the already large number of differences over D1.
22. The Board remarks that the number of differences over a certain piece of prior art is neither decisive nor a reliable indicator for the presence of an inventive step.
22.1 First, the number of differences itself may be, and often is, deceiving. One modification may imply or make obvious several other differences. For instance, as in the case in hand, performing the computations on a server instead of on the user's vehicle implies data transmission, and with it a host of other associated "differences" which may or may not be specified in a claim, such as an antenna, a transmission protocol and so on. A library implies storage, indexing, a retrieval mechanism and so forth. Also, in complex systems it is very easy to accumulate a large number of individual differences simply by considering the different options available to the person skilled in the art.
22.2 Second, whether several modifications combine to provide an inventive overall contribution does not depend on their number. For instance, they may be obvious solutions to independent "partial problems".
23. Ultimately, the claimed invention must contain a (new and) non-obvious technical teaching. The Board does not see such a non-obvious teaching reflected in any of the requests on file.
24. The Board concludes that the auxiliary requests, like the main request, are not allowable for lack of inventive step (Article 56 EPC).
For these reasons it is decided that:
The appeal is dismissed.