-
[hal-05230510] Seed Inference in Interacting Microbial Communities Using Combinatorial Optimization
The behaviour of microorganisms and microbial communities can be abstracted by models combining a description of their metabolic capabilities as metabolic networks, and suitable computational or mathematical paradigms that further integrate simulation conditions. A major component of the latter is the composition of the environment or growth medium that can be referred to as seeds. Predicting the seeds from the metabolic network and an expected behaviour is an inverse problem that can be addressed with linear programming or logic paradigms such as Answer Set Programming (ASP). Here, we formalise seed prediction for microbial communities, taking into account that their members may interact positively through metabolite transfers, which may reduce the need for external seed metabolites. We address the problem with ASP and add a hybrid component ensuring the satisfiability of linear constraints. We explore the subset-minimality solving heuristic of the Clingo solver and develop two heuristics supporting priority of seeds over transfers. We present a proof of concept of seed inference in small-scale communities, and assess the scalability of the three heuristics at genome-scale. Overall, our work introduces a hybrid logic-linear model for seed inference in interacting microbial communities, and new heuristics for the exploration of the solution space with subset minimality optimisations.
ano.nymous@ccsd.cnrs.fr.invalid (Chabname Ghassemi Nedjad) 29 Aug 2025
https://inria.hal.science/hal-05230510v1
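The notion of subset-minimal seed sets in the entry above can be illustrated with a small brute-force sketch. This is not the authors' ASP encoding: it ignores metabolite transfers between community members and the linear constraints, and simply enumerates seed sets for a single network by forward propagation. The reaction format, the function names (`scope`, `minimal_seed_sets`) and the toy network are all hypothetical.

```python
from itertools import combinations

def scope(reactions, seeds):
    """Metabolites producible from the seeds by forward propagation."""
    reached = set(seeds)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            # A reaction fires once all of its substrates are available
            if set(substrates) <= reached and not set(products) <= reached:
                reached |= set(products)
                changed = True
    return reached

def minimal_seed_sets(reactions, candidates, targets):
    """Enumerate subset-minimal seed sets whose scope contains all targets."""
    found = []
    # Increasing size guarantees that every proper subset was tested earlier
    for size in range(len(candidates) + 1):
        for combo in combinations(sorted(candidates), size):
            seeds = set(combo)
            # A superset of a known solution cannot be subset-minimal
            if any(solution <= seeds for solution in found):
                continue
            if set(targets) <= scope(reactions, seeds):
                found.append(seeds)
    return found

# Hypothetical toy network, each reaction given as (substrates, products)
toy = [(("a", "b"), ("c",)), (("c",), ("d",)), (("e",), ("d",))]
solutions = minimal_seed_sets(toy, candidates={"a", "b", "e"}, targets={"d"})
print(solutions)
```

Exhaustive enumeration is exponential in the number of candidate seeds, which is one reason a dedicated solver is preferable at genome scale.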
-
[hal-04997560] Data paper: A goat behaviour dataset combining labelled behaviours and accelerometer data for training Machine Learning detection models
This paper presents a dataset of accelerometer data and corresponding video-annotated behaviours from eight indoor dairy Alpine goats. Animals were equipped with 3D-accelerometers attached to their ears for 24 consecutive hours and recorded at a frequency of 5 Hz. Video recordings for this period were also obtained. Activities associated with positional, feeding and social behaviours were annotated over two daylight periods, for a total of 11 hours per goat, by a trained observer assuring high precision and consistency. This dataset can be used independently or complement an existing dataset for training supervised Machine Learning models for the detection of goat behaviour. It contributes to improving the robustness of such models by incorporating behavioural signals specific to indoor-housed goats.
ano.nymous@ccsd.cnrs.fr.invalid (Sarah Mauny) 19 Mar 2025
https://hal.inrae.fr/hal-04997560v1
-
[hal-05264391] Method: An accurate method for detecting drinking bouts in dairy cows based on reticulorumen temperature
This study evaluated the performance of three methods for detecting drinking bouts in dairy cows using reticulorumen temperature (RT): the 'FixT' method based on a fixed RT threshold, the 'Cow-dT' method based on a cow-day-specific RT threshold, and the 'FallST' method based on RT fall slope. We observed the drinking behaviours of 28 dairy cows equipped with reticulorumenal sensors over 96 h to create a reference dataset. A total of 730 drinking bouts were observed. We matched detected drinking bouts against observed drinking bouts to obtain the number of true positives, false negatives, and false positives, and then calculated the detection performance of the three methods in terms of sensitivity (Se), positive predictive value (PPV), and F-score. The performance of the three RT-based methods (Se ≥ 90%, PPV > 96% and F-score ≥ 93%) was better than that reported in previous work using collar-attached accelerometers, but slightly lower than that of methods using drinking troughs connected to electronic identification systems or methods combining accelerometers with geomagnetic sensors or with ultra-wideband location. The FallST method showed slightly better performance (highest F-score) than the FixT and Cow-dT methods. The FallST method accurately detected drinking bouts lasting more than 30 s and at least 30 min apart, with a detection time accuracy of 10 min. The models using RT curve parameters failed to predict characteristics of the drinking bouts. In conclusion, the method developed here can accurately detect drinking bouts in dairy cows using RT, but without further characterisation of the drinking bouts (e.g. duration).
ano.nymous@ccsd.cnrs.fr.invalid (L. Aubé) 17 Sep 2025
https://hal.inrae.fr/hal-05264391v1
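The three detection metrics named in the entry above can be computed from matched counts as in the following minimal sketch. It assumes that the F-score is the usual harmonic mean of Se and PPV (F1), which the abstract does not state explicitly. The function name and the counts are hypothetical and are not taken from the study.

```python
def detection_metrics(tp: int, fn: int, fp: int) -> dict:
    """Sensitivity, positive predictive value and F-score from matched bout counts."""
    se = tp / (tp + fn)    # share of observed bouts that were detected
    ppv = tp / (tp + fp)   # share of detected bouts that were actually observed
    f_score = 2 * se * ppv / (se + ppv)  # harmonic mean of Se and PPV (assumed F1)
    return {"Se": se, "PPV": ppv, "F": f_score}

# Hypothetical counts for illustration only
metrics = detection_metrics(tp=90, fn=10, fp=5)
print(metrics)
```

The sketch does not guard against empty denominators, so it assumes at least one observed and one detected bout.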
-
[hal-05178193] Spectral indices in remote sensing of soil: definition, popularity, and issues. A critical overview
Serving as a powerful proxy in remote sensing studies, spectral indices can generate meaningful environmental interpretation from either raw or atmospherically corrected spectral data, and characterise and quantify some important properties of various objects on Earth’s surface. However, while numerous spectral indices have been developed over time, from the very launch of civilian satellites until now, some critical issues in their usage, such as comparability, remain scarcely studied, which may lead to incorrect, inconsistent, and unreliable results. In this study, we collected 471 spectral indices of various environment components (vegetation, water, and soil) that might be leveraged for soil studies, and traced their popularity in scientific publications over the past decades. The bibliometric analysis revealed growing interest in and use of spectral indices as Earth-observing satellite technology advanced. Based on both literature and, for the sake of complementation and illustration, some targeted regional-scale case studies, we discuss the issues of naming confusion, comparability, applicability, accuracy trade-offs, and reproducibility of using spectral indices. Overall, this overview provides an extensive list of spectral indices, both soil indices and soil-related indices, that can be useful for characterising these environment components by remote sensing. It draws attention to some misuses and confusions that must be avoided to prevent scientific pitfalls. The comparisons between different spectral indices, sensors, and correction methods highlight the confusing effects that the misuse and non-standardised practices of the spectral indices useful for soil may have on soil property mapping and monitoring. Insights into the judicious and appropriate usage of spectral indices in the remote sensing of soil are provided.
ano.nymous@ccsd.cnrs.fr.invalid (Qianqian Chen) 24 Jul 2025
https://hal.inrae.fr/hal-05178193v1
-
[hal-05110984] Advancing agroecology and sustainability with agricultural robots at field level: A scoping review
Agricultural robots show a growing potential to improve resource management and reduce the environmental impacts of farming. However, the evaluation of robots’ contribution to support sustainable farming is still lacking. This study specifically reviewed the operationalization of four agroecological principles at the field level: recycling, soil health, biodiversity and synergy. To this aim, a scoping review was conducted on the Scopus database, with a query within titles, abstracts, and author keywords mentioning robots, and agroecology or sustainability. The body of literature was screened to include only open field robots. The resulting 78 documents were coded inductively on three macro areas: (1) academic background, (2) robot operations, (3) contribution to agroecology principles, whether explicitly or implicitly mentioned. The results highlight that robots operationalize agroecology principles through non-chemical and selective weeding to preserve diversity and soil health, lighter designs that reduce soil compaction, and advanced data collection systems to optimize resource use and synergy. Solar-powered robots represent early steps toward recycling, but this principle remains understudied. The discussion expands on the potential of robotics in other innovative approaches for sustainable agriculture, such as agroforestry, conservation agriculture, and novel farming system design. Key challenges include ensuring farmers are enabled to master data collection and management, as well as integrating high-tech robotics with low-tech solutions. These efforts are critical for leveraging agricultural robotics to advance agroecology and sustainability across diverse farming systems.
ano.nymous@ccsd.cnrs.fr.invalid (Mohammad Naim) 13 Jun 2025
https://hal.science/hal-05110984v1
-
[hal-05285601] Improving the prediction of soil organic carbon content using field-acquired hyperspectral data by accounting for soil moisture and surface roughness
Soil surface conditions such as moisture, roughness, and vegetation complicate accurate Soil Organic Carbon (SOC) prediction by altering spectral reflectance. Most studies consider these factors separately and under controlled conditions. Soil roughness has rarely been included [1,2], and typically not alongside soil moisture, which has mostly been studied in laboratory settings [3]. Common methods to reduce moisture effects on spectra, such as external parameter orthogonalization (EPO) and direct standardization (DS), rely heavily on lab-based datasets [3]. To address this, we assessed the influence of soil moisture and surface roughness as co-variables in models predicting SOC content from reflectance spectra of bare Luvisols near Versailles, France. Spectral data were collected under natural light at 76 points, along with volumetric soil moisture (θ) and 7 roughness indicators from photogrammetry [4]. SOC was predicted using Partial Least Squares Regression (PLSR) and Random Forest (RF), with 4-fold cross-validation repeated 10 times. Six wavelength-selection (WS) strategies were tested: two from satellite simulations (EnMAP, Sentinel-2), two from model variable importance (PLSR, RF), one expert-based, and one using all wavelengths. Moisture and roughness were added individually. In-field spectra enabled reasonably accurate predictions, with RF outperforming PLSR (SOC RMSE: 1.6–1.8 g.kg⁻¹). WS methods improved accuracy only when co-variables were added. Moisture had little effect, while roughness improved prediction quality in most cases, especially shadow percentage for PLSR and the semivariogram sill parameter for RF. These results highlight the benefit of including surface roughness to improve large-scale SOC prediction from remote sensing.
ano.nymous@ccsd.cnrs.fr.invalid (Hugues Merlet) 26 Sep 2025
https://hal.science/hal-05285601v1
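The validation scheme mentioned in the entry above (4-fold cross-validation repeated 10 times over 76 points) can be sketched with the standard library alone. The shuffling strategy, the seed and the function name are assumptions made for illustration. The abstract does not say how folds were drawn.

```python
import random

def repeated_kfold(n_samples, n_splits=4, n_repeats=10, seed=0):
    """Yield (train_idx, test_idx) pairs for repeated K-fold cross-validation."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # a fresh random partition for each repeat
        # Deal the shuffled indices into n_splits folds of near-equal size
        folds = [indices[i::n_splits] for i in range(n_splits)]
        for k in range(n_splits):
            test = sorted(folds[k])
            train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
            yield train, test

splits = list(repeated_kfold(76))
print(len(splits))
```

Each of the resulting train/test pairs would be used to fit and score one PLSR or RF model, and the scores averaged over all pairs.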
-
[hal-05285648] Spectral models learn the context, not the soil: rethinking SOC prediction from lab to drone measurements under field conditions
Predicting soil organic carbon (SOC) using spectral data remains a challenge in digital soil mapping, particularly under field-scale conditions where environmental factors (e.g., vegetation, moisture) can mask or distort soil reflectance [1]. These unstable conditions are a major obstacle to model generalization across space and time. In this study, we evaluated the ability of SOC prediction models, built from reflectance spectra from lab to field and drone measurements, to account for and generalize across varying environmental conditions, over a unique field plot structure located in Nouzilly (France). The experimental design consists of 3 replicates (block design) of 4 to 5 tillage practices (modality) within a single 11.25 ha field. Two sampling campaigns (Oct 2024, May 2025) provided SOC (0-5 cm) and spectral data from lab, field, and UAV platforms at 75 sampling points. Co-variables such as moisture content and soil surface roughness were also collected. To assess model generalizability to new spatial and temporal conditions, we applied several data-splitting strategies: random splits, leave-one-block-out, leave-one-modality-out, and time-based splits between the October and May datasets. Our results show that tillage modality alone induced significant SOC variability at the soil surface, with mean SOC ranging from 12.1 g/kg under conventional tillage to 16.7 g/kg under minimum tillage. Seasonal differences between October and May also contributed substantially to SOC variability, further complicating model generalization. In this context, co-variables related to soil roughness and moisture had no significant impact on improving model accuracy. Model performance was highly sensitive to data-splitting strategy. Random splits gave overly optimistic results (R² = 0.75, RPIQ = 2.7 for field spectra), whereas leave-one-modality-out failed to generalize to unseen tillage practices, with most models showing R² < 0. 
Leave-one-block-out yielded reliable performance for laboratory spectra but failed for UAV and field data, especially under reduced environmental variability (e.g., in May or after NDVI-based filtering), with R² dropping from 0.72 to 0.28 for October UAV measurements. These findings suggest that models often rely on indirect or ephemeral environmental features rather than on the direct or intrinsic spectral behaviour of bare soil, resulting in unstable performance and poor transferability across space and time, even for similar soils.
ano.nymous@ccsd.cnrs.fr.invalid (Hugues Merlet) 26 Sep 2025
https://hal.science/hal-05285648v1
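The grouped data-splitting strategies named in the entry above (leave-one-block-out, leave-one-modality-out) share one mechanism: every sample carrying a given label is held out together. A minimal sketch, with a hypothetical function name and made-up labels that do not reproduce the study's layout:

```python
def leave_one_group_out(groups):
    """Yield (held_out_label, train_idx, test_idx), holding out one group at a time."""
    for label in sorted(set(groups)):
        test = [i for i, g in enumerate(groups) if g == label]
        train = [i for i, g in enumerate(groups) if g != label]
        yield label, train, test

# Hypothetical labels for six sampling points
blocks = ["B1", "B1", "B2", "B2", "B3", "B3"]
modalities = ["till", "min", "till", "min", "till", "min"]

block_splits = list(leave_one_group_out(blocks))
modality_splits = list(leave_one_group_out(modalities))
print(block_splits[0])
print(modality_splits[0])
```

Unlike a random split, the test samples here never share a block (or tillage modality) with the training samples, which is why the resulting scores are typically less optimistic.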
-
[hal-05261543] cMFA for multi-omics data integration in microbial community models
Understanding microbial community functions is challenging due to complex interactions and assembly mechanisms; however, advances in sequencing have enabled the collection of multi-omics data, including population counts and metabolomic or metatranscriptomic data. Our main objective is to develop a mathematical model capable of integrating time series of multi-omics data at a community scale. We introduce the community metabolic flux analysis (cMFA) method, which generalizes metabolic flux analyses, using a list of time series data of experimentally measured production and consumption rates of metabolites and microorganism growth. We aim to infer, for each member of the microbial community, the intracellular distribution of metabolic fluxes. This is a high-dimensional constrained linear regression problem, informed by mass conservation constraints and metatranscriptomic data, encoded in the penalty term. The difficulty here is in accurately inferring latent internal rates from a few observations of exchange fluxes. We evaluated the cMFA method on synthetic data from dynamic models of increasingly complex microbial communities, based on metabolic models of different mutants of Escherichia coli using dynamic flux balance analysis (dFBA). Synthetic metatranscriptomic data were obtained from internal metabolic fluxes in the dynamic model. Different regularization terms were tested, including different levels of sparsity, for the selected penalty weight. To evaluate the robustness of the method, multiple benchmarks were tested. These included assessments of the robustness of the method to data noise, incomplete metatranscriptomic data, and inaccurate prior knowledge of metabolic import rates, as well as an extension of the study to a larger microbial community. Currently, we are working with real data, including data on denitrification and cheese production.
ano.nymous@ccsd.cnrs.fr.invalid (Sthyve Junior Tatho Djeanou) 15 Sep 2025
https://hal.science/hal-05261543v1
-
[hal-05281103] Evaluating the potential of Sentinel-2 data to assess the coarse fragment cover of the soil surface within a Spanish vineyard
The presence of coarse fragments (CF) on the soil surface is a critical factor influencing the assessment of key soil properties such as hydraulic conductivity and C stocks, as well as erosion processes [1–3]. This study investigates the potential of Sentinel-2 (S2) data to estimate soil surface CF cover for an 82-ha trellis-trained vineyard (Burgos, Spain), with ~3 m inter-row spacing. CF cover (%) was estimated using the point-count method via SamplePoint [4], based on nadir photos taken at ~1 m above ground level, at 60 points repeatedly during three field campaigns. Based on two S2 time series (Jan 2023–Feb 2024 and Jan–Apr 2023 (vine dormancy)), six spectral indices computed within a 30 m buffer were clustered through hierarchical agglomerative clustering (HAC) and principal component analysis (PCA), which led to the selection of the Non-photosynthetic vegetation soil separation index (NSSI). Assessment of NSSI relevance relied on correlating NSSI values, extracted from S2 images closest to field campaign dates, with the average CF cover, with and without applying an NDVI threshold of 0.4. A Random Forest algorithm was then used to predict CF cover, with a 70% calibration / 30% validation split repeated over three random iterations. Two approaches were tested, with and without the NDVI threshold: (1) S2 bands only, and (2) S2 bands + NSSI + NDVI. NSSI was moderately correlated with CF cover (R² = 0.47–0.60), and better correlated when the NDVI threshold was applied (R² = 0.48–0.77). Calibration performance was good across all models (R² > 0.6; RMSE < 16.75%; RPD > 1.62; RPIQ > 2.23), even though validation results were variable. NDVI thresholding alone did not improve validation, but adding NSSI + NDVI as predictors enhanced validation accuracy. The best performance was obtained by combining data from all campaigns using S2 bands + NSSI + NDVI without any NDVI threshold (R² = 0.42; RMSE = 17.53%; RPD = 1.55; RPIQ = 2.05).
ano.nymous@ccsd.cnrs.fr.invalid (Hayfa Zayani) 24 Sep 2025
https://hal.science/hal-05281103v1
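The accuracy metrics quoted in the entry above follow standard definitions: RPD is the standard deviation of the observed values divided by the RMSE, and RPIQ is their interquartile range divided by the RMSE. The sketch below assumes R² is the coefficient of determination, the sample standard deviation, and the default quartile method of Python's `statistics.quantiles`. The abstract does not specify these choices, and the data are made up.

```python
import math
import statistics

def regression_metrics(observed, predicted):
    """R2, RMSE, RPD and RPIQ for paired observed and predicted values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    sd = statistics.stdev(observed)                  # sample standard deviation
    q1, _, q3 = statistics.quantiles(observed, n=4)  # default 'exclusive' method
    return {
        "R2": 1 - ss_res / ss_tot,   # coefficient of determination
        "RMSE": rmse,
        "RPD": sd / rmse,            # ratio of performance to deviation
        "RPIQ": (q3 - q1) / rmse,    # ratio of performance to interquartile range
    }

# Hypothetical CF cover values (%), for illustration only
result = regression_metrics([10, 20, 30, 40, 50], [12, 18, 33, 37, 52])
print(result)
```

Because quartile conventions differ between software packages, RPIQ values computed with another tool may differ slightly from this sketch on small samples.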
-
[hal-05281203] Detection of soil management practices using Sentinel-1 time series: the challenges raised by the diversified management sequences in vineyards
Characterization of soil management strategies in the complex agroecosystems of vineyards is crucial to evaluate their impact on vineyard soil health, particularly in the context of increasing soil threats posed in semi-arid environments [1], [2], [3], [4]. This study evaluates the potential of Sentinel-1 (S1) radar time series to detect soil management practices in Spanish vineyards. Two trellis-trained vineyard plots (ca. 4 ha each) located in the Toledo province (Spain) were studied, each subjected to a distinct soil management practice: conventional tillage (TILL) and a cover cropping system (CC), respectively. A farmer survey was conducted to thoroughly document the sequence, timing, and spatial distribution of management operations carried out between October 2020 and August 2024. A methodology based on S1 radar signal change detection was applied to detect soil surface roughness associated with these practices. The survey data served as a reference to evaluate the accuracy of the S1-derived detections. Results revealed a very high degree of variability in vineyard management practices, in terms of type, spatial distribution and frequency within these fields. Despite such diversified management sequences, satellite-based detection was effective on average over more than 60% of the plot surface area: for tillage in both the TILL and CC plots, and for weed control and rolling in the CC plot only. Additionally, mechanical pruning was successfully detected in the TILL plot. Our further research will explore the integration of S1 radar data with S2 optical imagery to refine this detection and assessment of soil management practices in viticultural systems.
ano.nymous@ccsd.cnrs.fr.invalid (Hayfa Zayani) 24 Sep 2025
https://hal.science/hal-05281203v1
-
[hal-05260643] Spectral indices in remote sensing of soil: definition, popularity, and issues. A critical overview
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Qianqian Chen) 15 Sep 2025
https://hal.inrae.fr/hal-05260643v1
-
[hal-05285538] Spatial prediction of soil properties using Sentinel-2 temporal mosaics of non-vegetated soils in a semi-arid region: A comparative evaluation of Google Earth Engine and THEIA platforms in Sminja
This study investigates the potential of Sentinel-2 (S2) temporal mosaics (TM) of non-vegetated soils for enhancing soil property mapping in the semi-arid Sminja Plain, Tunisia (480 km²). Utilising data from 2019 to 2023 across all seasons (autumn, spring, summer, and winter), we generated TM through the Google Earth Engine (GEE) and THEIA platforms. This comparative evaluation highlighted the importance of platform selection and seasonal considerations in remote sensing-based soil property predictions. Non-vegetated soils were isolated using thresholds of NDVI < 0.35 and NBR2 < 0.09 to maximise non-vegetated soil extraction. Key soil properties analysed through a dataset of 215 sample locations regularly spread over the area included electrical conductivity (EC), soil organic carbon (SOC), pH, base saturation (BS), granulometric fractions, and soil moisture content. Random forest (RF) models with K-fold cross-validation assessed the predictive performance, evaluated using RMSE, RPD, and RPIQ metrics. Results indicate that both GEE and THEIA platforms effectively predicted (with THEIA having a very slight edge) most of the soil properties (SOC, CaCO₃, Ca, base saturation, granulometric fractions, and soil moisture content) with RPIQ values exceeding 1.7, while predictions for pH, EC, K, Na, and P₂O₅ were poorly reliable with RPIQ < 0.8. This pinpointed the limitations of the generated RF models for certain soil properties in such environments. Seasonal variations slightly influenced model accuracy, underscoring the importance of platform selection and temporal considerations in remote sensing-based soil property prediction. These findings offer valuable insights for sustainable land management and agricultural planning in semi-arid regions.
ano.nymous@ccsd.cnrs.fr.invalid (Mukhtar Adamu Abubakar) 26 Sep 2025
https://hal.science/hal-05285538v1
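The non-vegetated soil filter described in the entry above can be sketched per pixel using the usual index definitions, NDVI = (NIR − Red) / (NIR + Red) and NBR2 = (SWIR1 − SWIR2) / (SWIR1 + SWIR2), together with the two thresholds stated in the abstract. The function names and reflectance values are hypothetical. The mapping to Sentinel-2 bands (B4, B8, B11, B12) is the common convention and is an assumption here, since the abstract does not list the bands.

```python
def normalized_difference(a: float, b: float) -> float:
    """(a - b) / (a + b), or NaN when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else float("nan")

def is_bare_soil(red, nir, swir1, swir2, ndvi_max=0.35, nbr2_max=0.09):
    """True when a pixel passes both non-vegetated soil thresholds."""
    ndvi = normalized_difference(nir, red)      # assumed bands: B8 and B4
    nbr2 = normalized_difference(swir1, swir2)  # assumed bands: B11 and B12
    # Comparisons with NaN are False, so undefined pixels are rejected
    return ndvi < ndvi_max and nbr2 < nbr2_max

# Hypothetical surface reflectances, for illustration only
soil_pixel = is_bare_soil(red=0.20, nir=0.28, swir1=0.35, swir2=0.33)
crop_pixel = is_bare_soil(red=0.05, nir=0.45, swir1=0.20, swir2=0.10)
print(soil_pixel, crop_pixel)
```

A temporal mosaic would then aggregate, for each pixel, only the acquisition dates on which this test passes.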
-
[hal-05260506] Bare soil mosaicking optimisation for soil organic carbon prediction in Centre-Val de Loire
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Qianqian Chen) 15 Sep 2025
https://hal.inrae.fr/hal-05260506v1
-
[hal-05263772] Uncovering the role of phenotypic traits in shaping the genetic structure of alfalfa
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Irving Arcia Ruiz) 16 Sep 2025
https://hal.inrae.fr/hal-05263772v1
-
[hal-05263786] Genetic and phenotypic diversity of lucerne (Medicago sativa) for optimising its role as a living mulch in agroecological systems
Lucerne (Medicago sativa), a drought-tolerant forage legume, is increasingly used in agroecological systems as a service plant, particularly in intercropping with annual crops such as cereals. As a perennial crop, lucerne forms a living mulch that potentially offers multiple benefits, including weed control, nitrogen enrichment, and support for reduced tillage practices, thus saving energy consumption and preserving biodiversity. However, recent studies have highlighted a significant challenge: current lucerne varieties, selected for forage production, are overly competitive and negatively impact the productivity of interplanted crops such as wheat. This study aimed at analysing the genetic and phenotypic diversity within the M. sativa complex to identify traits that enhance lucerne's effectiveness as a living mulch, focusing on competition for light and nitrogen among lucerne, wheat, and weeds, and later their genetic determinism. Thirty diverse lucerne accessions, representing different subspecies, autumn dormancy levels, and plant architectures (ranging from prostrate to erect forms), were evaluated. In the first phase of the study, the effects of lucerne dormancy and growth habit on wheat dominance during early stages and weed abundance were assessed. Later in the season, at the wheat heading stage, the impact of lucerne height and lodging on wheat biomass and nitrogen status was evaluated. In the second phase, forty plants of each accession were genotyped using Genotyping-by-Sequencing (GBS) and phenotyped in a nursery for plant height, growth habit, and lodging susceptibility. Genetic variance was estimated using the REML model, and broad-sense heritability was calculated. Genome-Wide Association Studies (GWAS) with a Multi-Locus Mixed Model (MLMM) were used to identify candidate QTLs. The results suggest that lucerne varieties with slow growth, moderate height, and low lodging are most effective as living mulches.
However, no variety in the panel exhibited all these desirable traits, requiring dedicated breeding programmes to create this ideotype. A total of 100K SNPs covering all eight lucerne chromosomes were identified. Genetic structure analysis revealed distinct separation between wild and cultivated forms, as well as between the falcata and sativa subspecies. A high heritability of the traits was observed, varying from 0.47 for lodging to 0.69 for growth habit. Variation around the correlations between traits suggested that it is possible to combine favourable traits in a single variety. QTLs were obtained; they could be used to speed up the genetic progress in breeding programmes. These findings provide a foundation for the genetic improvement of lucerne as a living mulch, contributing to effective and sustainable agricultural practices. Further analyses are ongoing to determine how to introduce molecular markers in the selection of lucerne varieties adapted to living mulch use.
ano.nymous@ccsd.cnrs.fr.invalid (Zineb El Ghazzal) 16 Sep 2025
https://hal.inrae.fr/hal-05263786v1
-
[hal-05265025] Enhanced Genome Assemblies of French-Bred Dactylis glomerata and Medicago sativa: Achieving High-Quality Tetraploid Genomes
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Marie Pegard) 17 Sep 2025
https://hal.inrae.fr/hal-05265025v1
-
[hal-05262597] How can digital technology use and innovation contribute to sustainable transformation of business models in the agri-food sector?
Expectations of digital technologies in sustainable agricultural development are considerable. However, applying these technologies in agri-food value chains can have downsides, which are still barely studied. The main objectives of this systematic literature review were to establish the state of the art of research on the use of digital technologies in business models contributing to sustainability in the agri-food sector, and to make recommendations for future research and management practice. A literature review is conducive to bringing concepts together, developing a theoretical framework and advancing knowledge. This systematic review followed the widely used PRISMA method. It yielded an overview of the factors of digitalisation in business models of agri-food value chains. Key themes found in the literature were the effects of COVID-19 on digitalisation and business resilience, the sustainability of business models in an economic sense, and the importance of communication technologies in agri-food value chains. This paper argues that even though digital technologies can enhance social interaction, the human element can be lost in the process. Even if one business makes successful use of digital technologies, other actors in local and international value chains might not profit. The paper recommends that future research and management practice use a framework that looks, from a value co-creation and open innovation perspective, at both the business model level and the interaction between (sustainable) business models in local and global food systems.
ano.nymous@ccsd.cnrs.fr.invalid (Laura Eline Slot) 16 Sep 2025
https://hal.inrae.fr/hal-05262597v1
-
[hal-05193281] Developing a microfluidic qPCR chip to quantify microbial taxa with a potential biocontrol activity against grapevine downy mildew
Grapevine downy mildew, caused by the oomycete Plasmopara viticola, is responsible for significant economic losses each year and for a large proportion of the fungicides used in viticulture. In order to limit the use of these chemical pesticides, which are incompatible with the development of sustainable viticulture, biocontrol solutions based on cultivated simplified communities of microorganisms (SimComs) are gradually emerging. In the present study, we designed several SimComs for the control of downy mildew, using a collection of microorganisms isolated from grapevine leaves by a culturomic approach. The SimComs were composed of bacteria, yeasts and filamentous fungi, either described as having a biocontrol activity against plant pathogens or abundant on grapevine leaves. We tested the hypothesis that including abundant species in the SimComs would help the microbial community colonize the leaves. Materials and methods: A quantitative PCR microfluidic chip (Fluidigm Biomark) was developed to monitor the establishment of SimComs on grapevine leaves. Microfluidic PCR allows a larger number of reactions than classical qPCR, making it quicker and cheaper per sample. It also has the advantage of providing absolute abundance data, unlike metabarcoding approaches, which only estimate the relative abundance of microbial taxa. Results: So far, specific primers for 34 microbial taxa (out of the 42 selected for inclusion in the SimComs) have been designed in single-copy housekeeping genes. We are currently sequencing the genomes of the remaining microbial taxa to complete the primer design. We applied the microfluidic chip to DNA samples extracted from grapevine leaf discs inoculated with SimComs and were able to detect most of the inoculated microorganisms, including some microbial taxa that significantly reduced the intensity of downy mildew symptoms under laboratory conditions.
The microfluidic chip was then applied to environmental DNA collected in a vineyard from a spore sensor, in order to detect and quantify the targeted protective microorganisms. By doing so, we were able to detect the presence of several microorganisms, including some microbial taxa with proven biocontrol activity against plant pathogens such as Bacillus pumilus, Aureobasidium pullulans and Epicoccum nigrum. Conclusion: These preliminary results shed light on the potential of the microfluidic chip as a new molecular diagnostic tool to monitor specific microbial communities present naturally or artificially after SimCom inoculation in the field.
ano.nymous@ccsd.cnrs.fr.invalid (Manon Chargy) 30 Jul 2025
https://hal.science/hal-05193281v1
-
[hal-05161584] Cross-Species Predictions of Chromatin Annotations using Neural Networks
Better knowledge of the functional annotations of livestock species can be a lever to link genome to phenome. The genomes of most livestock species have already been sequenced. However, data describing gene regulation mechanisms and chromatin state are insufficient. In contrast, abundant human and mouse data have allowed the training of powerful deep learning algorithms. Here, we propose to use 3 artificial neural networks (Deepbind, DeepSEA and Enformer), trained with human and mouse data, to predict annotations on the pig, cattle, chicken and European seabass genomes. The predictions are then compared with experimental data to evaluate the cross-species performance of the neural networks. First, human-trained neural network predictions performed on the mouse reference genome showed varying levels of accuracy depending on the experiment, with the highest performance for H3K4me3 (auPRC = 0.624). Second, the predictions on the pig, cattle and chicken genomes showed similar performances (lower mean auPRC = 0.385 ± 0.233), which were better than those on the seabass genome (mean auPRC = 0.144 ± 0.096). Third, the evaluation of the impact of genomic features on the predictions highlighted better performances for CpG islands and 5'UTRs than for other features. Finally, the comparison of predictions between different pig breeds with high genetic diversity demonstrated that genetic variability does not affect the performance, but rather observations. To conclude, we showed that the 3 neural networks evaluated can be used to predict annotations on non-mammalian genomes with similar performances (chicken), but not on genomes of organisms phylogenetically too distant (seabass).
ano.nymous@ccsd.cnrs.fr.invalid (Noémien Maillard) 14 Jul 2025
https://hal.inrae.fr/hal-05161584v1
-
[hal-05235924] LIPH4SAS: The French nationally distributed research infrastructure for livestock phenotyping
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Jean Pierre Bidanel) 02 Sep 2025
https://hal.inrae.fr/hal-05235924v1
-
[hal-05161904] Generation of metabolomic-informed models of metabolism for microbial communities
The generation of genome-wide metabolic networks has become a routine analysis for individual organisms or communities. However, these automatically generated metabolic networks are incomplete because they are constructed based on the combination of gene annotation and reactions available in generic databases (MetaCyc, BiGG, ModelSEED...). These are oriented towards well-known organisms or model organisms and miss out on important functions of secondary metabolism. We propose to combine metabolomic data analysis, metabolic modelling and annotation mining to build high-quality models of microbial metabolism with the long-term aim of better understanding microbial communities. In terms of application of the methods to plant microbial communities, we hope that the newly developed models will provide a better understanding of the process of microbial recruitment by the plant: metabolic functions involved, micro-organisms associated with these functions.
ano.nymous@ccsd.cnrs.fr.invalid (Coralie Muller) 15 Jul 2025
https://inria.hal.science/hal-05161904v1
-
[hal-05163368] Statistical method inference cMFA for multi-omics data integration in microbial community models
Understanding microbial community functions is challenging due to complex interactions and assembly mechanisms; however, advances in sequencing have enabled the collection of multi-omics data, including population counts and metabolomic or metatranscriptomic data. Our main objective is to develop a mathematical model capable of integrating time series of multi-omics data at a community scale. We introduce the community metabolic flux analysis (cMFA) method, which generalizes metabolic flux analyses, using a list of time series data of experimentally measured production and consumption rates of metabolites and microorganism growth. We aim to infer, for each member of the microbial community, the intracellular distribution of metabolic fluxes. This is a high-dimensional constrained linear regression problem, informed by mass conservation constraints and metatranscriptomic data, encoded in the penalty term. The difficulty here is in accurately inferring latent internal rates from a few observations of exchange fluxes. We evaluated the cMFA method on synthetic data from dynamic models of increasingly complex microbial communities, based on metabolic models of different mutants of Escherichia coli using dynamic flux balance analysis (dFBA). Synthetic metatranscriptomic data were obtained from internal metabolic fluxes in the dynamic model. Different regularization terms were tested, including different levels of sparsity, for the selected penalty weight. To evaluate the robustness of the method, multiple benchmarks were tested. These included assessments of the robustness of the method to data noise, incomplete metatranscriptomic data, and inaccurate prior knowledge of metabolic import rates. Currently, we are working with real data and expanding the study to a larger microbial community.
ano.nymous@ccsd.cnrs.fr.invalid (Sthyve Junior Tatho Djeanou) 15 Jul 2025
https://inria.hal.science/hal-05163368v1
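The constrained regression described in the abstract above can be written down schematically. The following is only a sketch of one plausible formulation, not the authors' exact model: every symbol (v, E, r, S, P_w, lambda, l, u) is an assumption introduced here for illustration.

```latex
\begin{aligned}
\hat{v}(t) \;=\; \arg\min_{v}\quad
  & \lVert E\,v - r(t) \rVert_2^2 \;+\; \lambda\, P_w(v) \\
\text{subject to}\quad
  & S\,v = 0, \\
  & \ell \le v \le u .
\end{aligned}
```

Here v stacks the internal and exchange fluxes of all community members, E maps those fluxes to community-level production and consumption rates, r(t) holds the rates measured at time t, S v = 0 expresses mass conservation for internal metabolites, and the bounds l and u could carry prior knowledge such as import rates. P_w stands for a penalty weighted by metatranscriptomic data w (for instance a weighted l1 term for sparsity), and lambda is the penalty weight.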
-
[hal-05194387] Bread wheat pangenome Graphs
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Pauline Lasserre) 31 Jul 2025
https://hal.inrae.fr/hal-05194387v1
-
[hal-05086840] Spatiotemporal modeling of host–pathogen interactions using level-set method
Phenotyping host-pathogen interactions is crucial for understanding infectious diseases in plants. Traditionally, this process has relied on visual assessments or manual measurements, which can be subjective and labor-intensive. Recent advances in image processing and mathematical modeling enable the precise and high-throughput phenotyping of plant symptoms. Among many challenges, considering local deformations of symptoms and host tissues is difficult in plant pathology. In this study, we address this question using a level-set method. We propose an innovative approach in plant pathology that allows one to reconstruct the continuous deformation of leaf and lesion contours from daily image sequences of inoculated leaves. We consider pea stipules inoculated by the fungal pathogen Peyronellaea pinodes as an example pathosystem. After extracting lesion and stipule contours from daily visible images, we use the level-set method to track their deformations within image sequences. The visual assessment of model adequacy, along with the Jaccard Index and relative error metrics, demonstrated strong overall performance. Results showed a gradual decrease in model accuracy over time for leaf contours, while lesion contours exhibited a higher relative error on the first targeted date. These findings highlight the robustness of our method while identifying specific challenges in early lesion detection. We finish by discussing the interest in this method based on partial differential equations for the study of host-pathogen interactions, especially the development of original phenotyping methods in plant pathology.
ano.nymous@ccsd.cnrs.fr.invalid (Sheila Rae Permanes) 02 Jun 2025
https://hal.inrae.fr/hal-05086840v2
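The Jaccard Index mentioned in the abstract above compares two regions as the size of their intersection divided by the size of their union. The sketch below shows the metric on small binary masks; the function name and the nested-list mask representation are assumptions made for illustration, not the authors' code.

```python
def jaccard_index(mask_a, mask_b):
    """Jaccard index of two binary masks given as nested lists of 0/1 values."""
    # Collect the coordinates of the foreground pixels in each mask.
    a = {(r, c) for r, row in enumerate(mask_a) for c, v in enumerate(row) if v}
    b = {(r, c) for r, row in enumerate(mask_b) for c, v in enumerate(row) if v}
    union = a | b
    if not union:
        # Two empty regions are treated as identical (a convention chosen here).
        return 1.0
    return len(a & b) / len(union)


# Hypothetical observed and modelled lesion masks on a 2 x 3 pixel grid.
observed = [[0, 1, 1],
            [0, 1, 0]]
modelled = [[0, 1, 0],
            [0, 1, 1]]
print(jaccard_index(observed, modelled))  # 2 shared pixels out of 4 in the union
```

A value of 1 means the two contours enclose exactly the same pixels, and 0 means they do not overlap at all.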
-
[hal-05117608] The future of systems genetics in farm animal sciences
Farm animal species are under intense selection on relatively small population sizes. Genetic and genomic selection has provided remarkable genetic gains in the last century. Nevertheless, current methods aiming to link genome to phenome in such populations remain limited, notably due to the difficulty of identifying causal variants for complex traits. The diversity of species as well as breeds in livestock has diluted the number of genomic datasets available for each genome as compared to model organisms or human diseases. In this article we propose a systems genetics approach as an opportunity to go beyond current limits, taking advantage of novel computational developments allowing integration of omics datasets from different analyses across species. A major challenge is that systems genetics requires careful but efficient data and metadata management, as well as rigorous statistical methods and strategies on which approach to use. Here, we highlight examples of the broad contribution systems genetics can bring to farm animal sciences, particularly across species, notably in the genome-to-phenome field within the larger scope of agricultural challenges including adaptation to environmental changes and animal welfare.
ano.nymous@ccsd.cnrs.fr.invalid (Guillaume Devailly) 17 Jun 2025
https://hal.science/hal-05117608v1
-
[hal-05105798] Inference Method cMFA for multi-omics data integration in microbial community models
Understanding microbial community functions is challenging due to complex interactions and assembly mechanisms. However, advances in sequencing technologies have enabled the collection of multi-omics data, including population counts and metabolomic or metatranscriptomic profiles. Our main objective is to develop a mathematical model capable of integrating time series of multi-omics data at the community scale. We introduce the community Metabolic Flux Analysis (cMFA) method: a biology-informed inference approach that generalizes classical Metabolic Flux Analysis. This high-dimensional analytical framework aims to estimate metabolic fluxes by integrating multi-omics data. Specifically, we aim to (i) quantify, for each member of the microbial community, their individual contributions to overall community dynamics based on external measurements of metabolite dynamics, and (ii) infer their intracellular distribution of metabolic fluxes. The difficulty here is in accurately inferring latent internal rates from a few observations of community-scale consumption and production rates for extracellular metabolites. We evaluated the cMFA method using synthetic data generated from dynamic models of microbial communities of increasing complexity using dynamic flux balance analysis, based on metabolic models of different Escherichia coli mutants. Synthetic metatranscriptomic data were obtained from internal metabolic fluxes simulated in the dynamic model. To assess the robustness of the method, we benchmarked its performance under varying levels of experimental noise.
ano.nymous@ccsd.cnrs.fr.invalid (Sthyve Junior Tatho Djeanou) 10 Jun 2025
https://hal.science/hal-05105798v1
-
[hal-05099666] Lucerne genetic diversity for living mulch: identifying key traits and evaluating their impacts on wheat development
Context: Lucerne (Medicago sativa) can offer ecosystem services as a perennial living mulch, supporting annual cash crops through weed suppression and nitrogen fixation. However, trials with wheat have shown that current lucerne varieties are excessively competitive, leading to reduced wheat yields. Aims: This study aimed to analyse the diversity within the M. sativa complex to identify traits that enhance lucerne effectiveness as a living mulch, focusing on the competition for light and nitrogen among lucerne, wheat and weeds. Methods: Thirty diverse lucerne accessions were cultivated as living mulch with winter wheat over 2 years. Lucerne dormancy and growth habit effects were evaluated on wheat relative dominance during the early stages and on weed abundance. In later stages, the effects of lucerne height and lodging on wheat biomass and nitrogen status were also assessed. Key results: Results indicated that lucerne dormancy and growth habit influenced wheat growth during early stages, with dormant and prostrate lucerne accessions reducing competition and enhancing wheat dominance. However, non-dormant and erect lucerne accessions effectively suppressed weeds but competed intensely with wheat. Tall and erect lucerne accessions supported wheat nitrogen status in the second year only. Lucerne lodging affected wheat growth, with tall lucerne reducing wheat biomass in the first year. Conclusions: Lucerne should exhibit slow growth, moderate height, and low lodging to optimise its benefits. No variety in our panel exhibited all these desirable traits. Implications: These findings highlight the need for breeding programs to combine lucerne beneficial traits as a living mulch into new varieties.
ano.nymous@ccsd.cnrs.fr.invalid (Zineb El Ghazzal) 05 Jun 2025
https://hal.inrae.fr/hal-05099666v1
-
[hal-05094925] Simplified guide - Managing your research data, for project leaders
Message for project leaders. This document aims to summarise the recommendations for better data management in research projects. This is my proposal to make your life easier. Four pages covering the different aspects of data management, followed by one page of resources: no excuse not to read them! You can help improve it by sending us feedback.
ano.nymous@ccsd.cnrs.fr.invalid (Frédéric de Lamotte) 03 Jun 2025
https://hal.inrae.fr/hal-05094925v1
-
[hal-05288241] Mapler: a pipeline for assessing assembly quality in taxonomically rich metagenomes sequenced with HiFi reads
Summary: Metagenome assembly seeks to reconstruct the most high-quality genomes from sequencing data of microbial ecosystems. Despite technological advancements that facilitate assembly, such as Hi-Fi long reads, the process remains challenging in complex environmental samples consisting of hundreds to thousands of populations. Mapler is a metagenome assembly and evaluation pipeline with a focus on evaluating the quality of Hi-Fi long read metagenome assemblies. It incorporates several state-of-the-art metrics, as well as novel metrics assessing the diversity that remains uncaptured by the assembly process. Mapler facilitates the comparison of assembly strategies and helps identify methodological bottlenecks that hinder genome reconstruction. Availability and implementation: Mapler is open source and publicly available under the AGPL-3.0 licence at https://github.com/Nimauric/Mapler. Source code is implemented in Python and Bash as a Snakemake pipeline. A snapshot of the code is available on Software Heritage at swh:1:snp:df4f5f02e22ebbab285ec14af58d4d88436ee5d6. Raw data and results are available at https://entrepot.recherche.data.gouv.fr/dataset.xhtml?persistentId=doi:10.57745/2SA8AB.
ano.nymous@ccsd.cnrs.fr.invalid (Nicolas Maurice) 29 Sep 2025
https://hal.science/hal-05288241v1
-
[hal-05105572] cMFA Inference method for multi-omics data integration in microbial community models
Understanding the functioning of microbial communities is challenging due to the complexity of their interactions and assembly mechanisms. However, advances in sequencing technologies have enabled the collection of multi-omics data, including population counts and metabolomic or metatranscriptomic profiles. Our main objective is to develop a mathematical model capable of integrating time series of multi-omics data at the community scale. We introduce the community Metabolic Flux Analysis (cMFA) method: a biology-informed inference approach that generalizes classical Metabolic Flux Analysis. This high-dimensional analytical framework aims to estimate metabolic fluxes by integrating multi-omics data. Specifically, we aim to (i) quantify, for each member of the microbial community, their individual contributions to overall community dynamics based on external measurements of metabolite dynamics, and (ii) infer their intracellular distribution of metabolic fluxes. The difficulty here is in accurately inferring latent internal rates from a few observations of community-scale consumption and production rates for extracellular metabolites. We evaluated the cMFA method using synthetic data generated from dynamic models of microbial communities of increasing complexity using dynamic flux balance analysis, based on metabolic models of different Escherichia coli mutants. Synthetic metatranscriptomic data were obtained from internal metabolic fluxes simulated in the dynamic model. To evaluate the robustness of the method, multiple benchmarks were tested. These included assessments of the robustness of the method to data noise, incomplete metatranscriptomic data, inaccurate prior knowledge of metabolic import rates, and larger microbial communities. We are currently finalizing various benchmarks and working with real experimental data.
ano.nymous@ccsd.cnrs.fr.invalid (Sthyve Junior Tatho Djeanou) 10 Jun 2025
https://hal.science/hal-05105572v1
-
[hal-05086556] New tools and algorithms for measuring animal welfare in the face of climate change – examples in cattle
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Florence Gondret) 27 May 2025
https://hal.inrae.fr/hal-05086556v1
-
[hal-05073304] Generation of metabolomic-informed models of metabolism in complex microbial communities
The generation of genome-wide metabolic networks has become a routine analysis for individual organisms or communities. However, these automatically generated metabolic networks are incomplete because they are constructed based on the combination of gene annotation and reactions available in generic databases (MetaCyc, BiGG, ModelSEED...). These are oriented towards well-known organisms or model organisms and miss out on important functions of secondary metabolism. We propose to combine metabolomic data analysis, metabolic modelling and annotation mining to build high-quality models of microbial metabolism with the long-term aim of better understanding microbial communities. In terms of application of the methods to plant microbial communities, we hope that the newly developed models will provide a better understanding of the process of microbial recruitment by the plant: metabolic functions involved, micro-organisms associated with these functions.
ano.nymous@ccsd.cnrs.fr.invalid (Coralie Muller) 19 May 2025
https://inria.hal.science/hal-05073304v1
-
[hal-05089855] Hierarchical Long Short Term Memory Recurrent Neural Network for Goats Behaviour Prediction from Accelerometer Data
Gastrointestinal parasitism is a major challenge in small grazing ruminants, affecting animal welfare and farmers’ income. In this regard, monitoring individual animal behaviour could help to develop new selection schemes that favor animals with a lower risk of larval infestation, but also support the targeted use of anthelmintics by focusing only on infested animals. Accelerometer sensors are widely used in combination with statistical models to predict the behaviour of grazing ruminants, but the lack of generalization of the models and the limited range of well-predicted behaviours are still challenging. In our study, we introduce an innovative methodology based on hierarchical long short-term memory (LSTM) recurrent neural networks to predict the main behaviours of goats on pasture. For that purpose, we collected accelerometer data from the horns of 59 Creole goats and annotated the behaviour over 144 hours of data. We defined 13 moving features that are mathematical combinations of the raw data to get more information while preserving the temporal structure of the accelerometer time-series. A data augmentation technique involving the addition of random noise was applied to sequences from the minority behaviour labels. A hierarchical LSTM model was then built to derive behaviours from a given accelerometer signal, by sequentially combining several models that first tackle simple classification tasks (e.g., grazing or non-grazing segments), then increasingly complex ones (e.g., displacement or other activities), progressively withdrawing segments that have already been identified. The hierarchical LSTM model was validated using a testing set consisting of goats not seen during model training, and carefully selected to maximize the heterogeneity of behavioural labels. The performance of the hierarchical LSTM model was also compared to that of a regular LSTM model, used as the baseline, which directly classified the raw signal into the 5 behaviours.
The highest performance was obtained with the hierarchical LSTM model, reaching an F-score of 87.84%, a precision of 89.44% and a recall of 86.3%. The best performance was obtained for grazing prediction (recall: 99.5%; precision: 99.4%), followed by resting (recall: 98.0%; precision: 98.4%) and ruminating (recall: 95.2%; precision: 89%). Most confusions occurred with displacement (recall: 51.2%; precision: 67.8%), likely due to the low number of sequences in the dataset (0.42% of the dataset). While other avenues remain to be explored for improving the prediction of such rare behaviours, our approach introduces key innovations that not only address the methodological limitations identified in the literature, but also facilitate further exploration of the role of goat behaviour in managing gastrointestinal parasitism on pasture.
ano.nymous@ccsd.cnrs.fr.invalid (Mathieu Bonneau) 02 Jun 2025
https://hal.inrae.fr/hal-05089855v1
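The abstract above reports an F-score alongside precision and recall without stating which variant is meant. A minimal sketch, assuming the usual F1 definition (the harmonic mean of precision and recall), is given below; the function name and the beta parameter are illustrative choices, not taken from the paper.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        # Avoid a division by zero when nothing was predicted correctly.
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Overall precision and recall reported in the abstract, in percent.
overall = f_score(89.44, 86.3)
print(round(overall, 2))  # 87.84

# Displacement, the least well predicted behaviour in the abstract.
displacement = f_score(67.8, 51.2)
print(round(displacement, 2))
```

Under this assumption, the overall precision and recall reproduce the reported F-score of 87.84%. The per-behaviour value for displacement is not reported in the abstract and is computed here only to illustrate the formula.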
-
[hal-04963112] 3-D shape control of deformable linear objects for branch handling using an adaptive Lyapunov-based scheme
Despite its various applications, robotic manipulation of deformable objects in agriculture has experienced limited development so far. This is due to the specific challenges in this domain, i.e., the variety of objects in this field is wide, and the deformation properties of the objects cannot be easily recognized in advance. In addition, deformable objects generally have complex dynamics and high-dimensional configuration space. In this paper, the manipulation of deformable linear objects (DLOs) is addressed by considering these challenges. Concretely, a new indirect adaptive control method is proposed to manipulate DLOs by controlling their shape in 3-D space towards previously defined targets, with a specific focus on agricultural applications. The proposed method can follow a desired dynamic evolution of the shape with a smooth deformation that brings about a stable gripper motion. This property of the method can protect the object from possible damage, even under large deformations, which is crucial in agricultural scenarios. An adaptation law is leveraged for estimating the system parameters, and Lyapunov analysis is employed to study the validity of the proposed control scheme. The scheme can be applied to diverse objects that can be modeled as linear, including tree branches or other rod-like structures. The effectiveness of the scheme is demonstrated through various experiments where, using shape feedback obtained from a 3-D camera, a robotic arm controls the shape of a flexible foam rod and of branches of different plants.
ano.nymous@ccsd.cnrs.fr.invalid (Omid Aghajanzadeh) 24 Feb 2025
https://hal.inrae.fr/hal-04963112v1
-
[hal-05035257] rs-pancat-compare
Program that calculates the distance between two GFA (Graphical Fragment Assembly) files. It takes as input the file paths of the two GFA files. The program first identifies the common paths between the two graphs by finding the intersection of their path names. For each common path, the program reads both versions and outputs the differences in segmentation between them. The purpose is to output the necessary operations (merges and splits) required to transform the graph represented by the first GFA file into the graph represented by the second GFA file.
ano.nymous@ccsd.cnrs.fr.invalid (Siegfried Dubois) 15 Apr 2025
https://hal.science/hal-05035257v1
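The comparison described above can be illustrated with a small sketch. It is not the rs-pancat-compare implementation: the function names are hypothetical, and it assumes GFA 1 files with explicit sequences on S lines and paths on P lines. Segment boundaries are expressed as offsets along each common path; a boundary present only in the first graph is counted as a merge, and one present only in the second as a split.

```python
def parse_gfa(text):
    """Return (segment lengths, paths) from GFA 1 text with explicit sequences."""
    seg_len, paths = {}, {}
    for line in text.splitlines():
        fields = line.split("\t")
        if fields[0] == "S":
            seg_len[fields[1]] = len(fields[2])
        elif fields[0] == "P":
            # Each step is a segment name followed by an orientation sign.
            paths[fields[1]] = [step[:-1] for step in fields[2].split(",")]
    return seg_len, paths


def breakpoints(seg_len, steps):
    """Offsets along the path at which one segment ends and the next begins."""
    cuts, pos = set(), 0
    for seg in steps[:-1]:
        pos += seg_len[seg]
        cuts.add(pos)
    return cuts


def compare(gfa_a, gfa_b):
    """Merges and splits needed to turn graph A's segmentation into graph B's."""
    len_a, paths_a = parse_gfa(gfa_a)
    len_b, paths_b = parse_gfa(gfa_b)
    result = {}
    # Only paths whose names occur in both graphs are compared.
    for name in sorted(paths_a.keys() & paths_b.keys()):
        cuts_a = breakpoints(len_a, paths_a[name])
        cuts_b = breakpoints(len_b, paths_b[name])
        result[name] = {
            "merges": sorted(cuts_a - cuts_b),  # boundaries to remove from A
            "splits": sorted(cuts_b - cuts_a),  # boundaries to add to A
        }
    return result


# Two toy graphs spelling the same 7-base path "ref" with different segments.
GFA_A = "\n".join([
    "S\t1\tACGT", "S\t2\tTT", "S\t3\tG",
    "P\tref\t1+,2+,3+\t*", "P\tonly_in_a\t1+\t*",
])
GFA_B = "\n".join([
    "S\ta\tACG", "S\tb\tTTTG",
    "P\tref\ta+,b+\t*", "P\tonly_in_b\ta+\t*",
])
print(compare(GFA_A, GFA_B))
```

In this toy case the path "ref" needs two merges (at offsets 4 and 6) and one split (at offset 3), and the paths present in only one graph are ignored. The total number of such operations could serve as a distance, which is one possible reading of the description above.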
-
[hal-05053970] Comparing three classification methods for plants and plant parts consumed by small ruminants in Mediterranean rangelands
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Thomas Dochier) 02 May 2025
https://hal.inrae.fr/hal-05053970v1
-
[hal-05026689] From cooperation problems in the management of Commons to biosocial organisations
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Julie Labatut) 09 Apr 2025
https://hal.inrae.fr/hal-05026689v1
-
[hal-05026612] Rethinking organisations as biosocial becomings
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Julie Labatut) 09 Apr 2025
https://hal.inrae.fr/hal-05026612v1
-
[hal-05163083] Bare soil mosaicking optimisation for soil organic carbon prediction in Centre-Val de Loire
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Qianqian Chen) 15 Jul 2025
https://hal.science/hal-05163083v1
-
[hal-05066058] Assessing the impact of incorporating soil moisture and roughness as co-variables to improve soil organic carbon content prediction from hyperspectral data
Predicting soil organic carbon (SOC) content is of critical importance for both environmental policies and monitoring issues [Criscuoli et al., 2024]. Traditional field soil sampling methods are tedious and cost-prohibitive, making remote sensing an attractive alternative for simplifying estimates, particularly at large scales. The EnMAP satellite offers great potential, as it captures a broad spectral range and high spectral resolution, both essential for accurately assessing SOC content [Chabrillat et al., 2023]. However, remote sensing acquisitions face many challenges, and optimal soil surface conditions are rarely achieved. Soils are often not fully bare; they may be moist and/or rough, which significantly impacts reflectance and, consequently, the ability to predict SOC content. Only a few studies have accounted for soil roughness [Denis et al., 2014; Piekarczyk et al., 2016], and they did so in isolation from soil moisture, which was primarily accounted for under controlled laboratory conditions [see review by Knadel et al., 2023]. This study aims to evaluate the benefit of incorporating co-variables related to soil moisture and surface roughness into SOC prediction models based on field spectroscopy and other in situ simultaneous measurements. A joint objective is to assess whether using EnMAP simulations of these spectra can improve model performance compared to other selections of specific wavelengths.
ano.nymous@ccsd.cnrs.fr.invalid (Hugues Merlet) 13 May 2025
https://hal.science/hal-05066058v1
-
[hal-04993468] CabriTrack: Accelerometer data for automated behavioural monitoring of grazing Creole goats
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Laura Faillot) 17 Mar 2025
https://hal.inrae.fr/hal-04993468v1
-
[hal-05175887] Satellite monitoring of soil C – potential applications to the wine-growing sector
This communication covers the objectives of the MELICERTES project and the general principles of spectral monitoring of C, then the first results derived from satellite series. It then addresses earlier findings in vineyards and applications to the wine-growing sector, followed by the prospects for application to the wine-growing sector and the work under way within the SANCHOSTHIRST project of EJP SOIL.
ano.nymous@ccsd.cnrs.fr.invalid (Emmanuelle Vaudour) 22 Jul 2025
https://hal.science/hal-05175887v1
-
[hal-04996131] How can we breed varieties that need fewer pesticides for fruit tree growing? The example of research on apricot and peach at INRAE
[...]
ano.nymous@ccsd.cnrs.fr.invalid (Morgane Roth) 18 Mar 2025
https://hal.science/hal-04996131v1
-
[hal-04993862] Mapler: Assessing assembly quality in taxonomically rich metagenomes sequenced with HiFi reads
Metagenome assembly seeks to reconstruct the most high-quality genomes from sequencing data of microbial ecosystems. Despite technological advancements that facilitate assembly, such as Hi-Fi long reads, the process remains challenging in complex environmental samples consisting of hundreds to thousands of populations. Mapler is a metagenome assembly and evaluation pipeline with a focus on evaluating the quality of Hi-Fi long read metagenome assemblies. It incorporates several state-of-the-art metrics, as well as novel metrics assessing the diversity that remains uncaptured by the assembly process. Mapler facilitates the comparison of assembly strategies and helps identify methodological bottlenecks that hinder genome reconstruction.
ano.nymous@ccsd.cnrs.fr.invalid (Nicolas Maurice) 18 Mar 2025
https://hal.science/hal-04993862v1
-
[hal-05006930] How can digitalisation contribute to sustainability of business models in agri-food value chains? A systematic literature review
The expectations of digital technologies in sustainable agricultural development are considerable. However, applying these technologies in agri-food value chains can have downsides, which are still barely studied. The main objectives of this systematic literature review were to discover the state of the art of the research in the use of digital technologies in business models contributing to sustainability in the agri-food sector, and to make recommendations for future research and management practice. A literature review helps to bring concepts together, develop a theoretical framework and advance knowledge. Here, the commonly used PRISMA method was used to develop a systematic literature review. From this review, we derived an overview of business model innovations and of the drivers, benefits and drawbacks of digitalisation in agri-food value chains. Key themes found in the literature were the effects of COVID-19 on digitalisation and business resilience, the economic sustainability of business models, and the importance of communication technologies in agri-food value chains. This article recommends that future research and management practice use a framework that looks, from a value co-creation and open innovation perspective, at the individual business model level and at the interaction between (sustainable) business models in local and global food systems.
ano.nymous@ccsd.cnrs.fr.invalid (Laura Eline Slot) 26 Mar 2025
https://hal.inrae.fr/hal-05006930v1
-
[hal-04996155] cMFA for multi-omics data integration in microbial community models
Microbial communities are an essential component of plant health, helping in nutrient acquisition and defense against pathogens. Despite their importance, the mechanisms behind their assembly and regulation remain poorly understood. Advances in sequencing and measuring technologies have enabled the collection of multi-omics data, including population counts on the abundance of microorganisms, metabolomic data on metabolite consumption and production, and metatranscriptomic data on gene activity within these communities. In order to answer the question of how these microorganisms function in the community and interact with one another, our main objective is to develop a mathematical model of dynamic systems capable of integrating these time series of multi-omics data at a community scale. Such a model will help to better decipher the functioning of the microbial community and understand its composition, knowing what each individual consumes and produces. To achieve this goal, we introduce the community-scale metabolic flux analysis (cMFA) method. In this poster, we assess the cMFA method on synthetic data from a dynamic model of increasingly complex microbial communities, built upon metabolic models of microorganisms. The observed growth rates were obtained from the spline smoothing of several replicates of the community dynamics. Synthetic metatranscriptomic data were produced from metabolic fluxes in the dynamic model. Different regularization terms were tested, including different levels of sparsity, for a cross-validated penalty weight. The cMFA method, implemented in Python with OSQP, a software package dedicated to quadratic programming problems, allows for the recovery of the functioning of microbial individuals from multi-omics data acquired at the community scale during growth experiments.
ano.nymous@ccsd.cnrs.fr.invalid (Sthyve Junior Tatho Djeanou) 18 Mar 2025
https://inria.hal.science/hal-04996155v1
-
[hal-05008533] Inferring Kernel ϵ-Machines: Discovering Structure in Complex Systems
Previously, we showed that computational mechanics' causal states (predictively equivalent trajectory classes for a stochastic dynamical system) can be cast into a reproducing kernel Hilbert space. The result is a widely applicable method that infers causal structure directly from very different kinds of observations and systems. Here, we expand this method to explicitly introduce the causal diffusion components it produces. These encode the kernel causal-state estimates as a set of coordinates in a reduced dimension space. We show how each component extracts predictive features from data and demonstrate their application on four examples: first, a simple pendulum, an exactly solvable system; second, a molecular-dynamic trajectory of n-butane, a high-dimensional system with a well-studied energy landscape; third, the monthly sunspot sequence, the longest-running available time series of direct observations; and fourth, multi-year observations of an active crop field, a set of heterogeneous observations of the same ecosystem taken for over a decade. In this way, we demonstrate that the empirical kernel causal-states algorithm robustly discovers predictive structures for systems with widely varying dimensionality and stochasticity.
Science progresses by discovering new structures and behaviors in the natural world. However, decades of success in nonlinear dynamics have driven home the message that systems in the world are nonlinear and high dimensional. Moreover, appropriately representing their emergent complexity and stochasticity makes structure discovery quite challenging. We recently introduced a discovery algorithm that learns optimal predictive features from measurement data as components in a reproducing kernel Hilbert space. The algorithm identifies predictive features in data series consisting of disparate data types, from categorical and discrete to fractal and continuous. In this way, the methodology exploits modern advances in machine learning fundamentals, resulting in a highly flexible and practicable algorithm for finding a system's effective state space, the minimal optimally predictive model. Notably, this offers a new geometric interpretation of the predictive structure of computational mechanics' ϵ-machines. Here, we demonstrate its use on four distinct examples including both simulated and real experimental data over a variety of data types.
ano.nymous@ccsd.cnrs.fr.invalid (Alexandra Jurgens) 27 Mar 2025
https://inria.hal.science/hal-05008533v1
-
[hal-04983681] IMPO: Interpretable Memory-based Prototypical Pooling
Graph Neural Networks (GNNs) have proven their effectiveness in various graph-structured data applications. However, one of the significant challenges in the realm of GNNs is representation learning, a critical concept that bridges graph pooling, aimed at creating compressed graph representations, and explainable artificial intelligence, which focuses on building models with transparent reasoning mechanisms. This research paper introduces a novel approach called Interpretable Memory-based Prototypical Pooling (IMPO) to address this challenge. IMPO is a graph pooling layer designed to enhance the interpretability of GNNs while maintaining high performance in graph classification tasks. It builds upon the MemPool algorithm and incorporates prototypical components to cluster nodes around class-aware centroids. This approach allows IMPO to selectively aggregate relevant substructures, paving the way for generating more interpretable graph representations. The experimental results in our study underscore the potential of pooling architectures in constructing inherently explainable GNNs. Notably, IMPO achieves state-of-the-art results in both classification and explanatory capacities across a diverse set of graph classification datasets.
ano.nymous@ccsd.cnrs.fr.invalid (Alessio Ragno) 09 Mar 2025
https://hal.science/hal-04983681v1
-
[hal-04996135] cMFA for multi-omics data integration in microbial community models
Microbial communities are an essential component of plant health, helping in nutrient acquisition and defense against pathogens. Despite their importance, the mechanisms behind their assembly and regulation remain poorly understood. Advances in sequencing and measuring technologies have enabled the collection of multi-omics data, including population counts on the abundance of microorganisms, metabolomic data on metabolite consumption and production, and metatranscriptomic data on gene activity within these communities. In order to answer the question of how these microorganisms function in the community and interact with one another, our main objective is to develop a mathematical model of dynamic systems capable of integrating these time series of multi-omics data at a community scale. Such a model will help to better decipher the functioning of the microbial community and understand its composition, knowing what each individual consumes and produces. To achieve this goal, we introduce the community-scale metabolic flux analysis (cMFA) method. In this poster, we assess cMFA on synthetic data from a dynamic model of increasingly complex microbial communities, built upon metabolic models of microorganisms. The observed growth rates were obtained from the spline smoothing of several replicates of the community dynamics. Synthetic metatranscriptomic data were produced from metabolic fluxes in the dynamic model. Different regularization terms were tested, including different levels of sparsity, for a cross-validated penalty weight. The cMFA method, implemented in Python with OSQP, a software package dedicated to quadratic programming problems, allows for the recovery of the functioning of microbial individuals from multi-omics data acquired at the community scale during growth experiments.
ano.nymous@ccsd.cnrs.fr.invalid (Sthyve Junior Tatho Djeanou) 18 Mar 2025
https://inria.hal.science/hal-04996135v1
-
[hal-05162236] GrAnnoT, a tool for efficient and reliable annotation transfer through pangenome graph
The increasing availability of genome sequences has highlighted the limitations of using a single reference genome to represent the diversity within a species. Pangenomes, encompassing the genomic information from multiple genomes, thus offer a more comprehensive representation of intraspecific diversity. However, pangenomes in the form of graphs often lack annotation information, which limits their utility for downstream analyses. We introduce here GrAnnoT, a tool designed for efficient and reliable annotation transfer using such graphs, by projecting existing annotations from a source genome to the graph and subsequently to other embedded genomes. GrAnnoT was benchmarked against state-of-the-art tools on pangenome graphs and linear genomes from rice, human, and E. coli. The results demonstrate that GrAnnoT is consensual, conservative, and fast, outperforming alignment-based methods in accuracy or speed or both. It provides informative outputs, such as presence-absence matrices for genes, and alignments of transferred features between source and target genomes, aiding in the study of genomic variations and evolution. GrAnnoT’s robustness and replicability across different species make it a valuable tool for enhancing pangenome analyses. GrAnnoT is available under the GNU GPLv3 licence at https://forge.ird.fr/diade/dynadiv/grannot.
ano.nymous@ccsd.cnrs.fr.invalid (Nina Marthe) 15 Jul 2025
https://hal.science/hal-05162236v1