Slack Export - #dir-environmental

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-21 13:54:45

@James Fellows Yates has joined the channel

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-21 13:54:55

@Becky Cribdon has joined the channel

Pete Heintzman (peteheintzman@gmail.com)

2020-08-21 13:54:55

@Pete Heintzman has joined the channel

Anneke ter Schure (a.t.m.t.schure@ibv.uio.no)

2020-08-21 13:54:55

@Anneke ter Schure has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-21 13:56:32

@Pete Heintzman @Anneke ter Schure I've added you both as you've said you've worked on sedaDNA in the past, but feel free to leave if you're not interested. @Becky Cribdon is looking for some help reviewing some papers for AncientMetagenomeDir to check the metadata she's added is correct. As you both joined the AncientMetagenomeDir channel I'm assuming you may be interested in helping out. Would either of you have time to help Becky?

Pete Heintzman (peteheintzman@gmail.com)

2020-08-21 17:27:31

Hi @James Fellows Yates and @Becky Cribdon, Sure - can devote some time to this next week. What needs doing, Becky?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-21 19:17:34

Becky needs a review on: https://github.com/SPAAM-workshop/AncientMetagenomeDir/pull/162

However I will need to add you to our SPAAM GitHub organisation. If you send me your username I will add it, and then Becky can take it from there!

} Becky-Cribdon (https://github.com/Becky-Cribdon)

#162 Slon2017

Pull Request

This PR is for a: ☑︎ New Publication(s) ☐ Correction

For the following list(s): ☐ ancientmetagenome-hostassociated ☐ ancientsinglegenome-hostassociated ☑︎ ancientmetagenome-environmental ☐ ancientmetagenome-anthropogenic

New Publication

Publication Information

This pull request is to add samples from the following publication(s): Slon 2017 10.1126/science.aam9695 to close #92 (https://github.com/SPAAM-workshop/AncientMetagenomeDir/issues/92)

Checklist

☑︎ Publication is published (preprints currently not accepted)?
☑︎ Checked the publication is not already in the database?
☑︎ Checked samples in this publication are not previously published data (newly re-sequenced metagenomes are OK!)?
☑︎ Samples are shotgun metagenomes (hostassociated-singlegenome may also contain whole-genome enriched data)?
☑︎ Checked the list follows conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA)?
☐ (If applicable) Updated the JSON files under `/assets/enums` with new categories (e.g. material or archive)?
☑︎ Does your PR pass validation checks?
☑︎ Changelog is updated?

If you do not know how to check errors in failed validation checks, expand here:

1. Press 'details' next to the failed check.
2. Expand the `test ancient <list>` line with the red X next to it.
3. Scroll to the bottom of the log, and look for a `DatasetValidationError` (usually the last line).
4. Read the error, and fix accordingly. Check the README for a given list for more guidance. If in doubt, ask!

Correction

This PR is for: ☐ ancientmetagenome-hostassociated ☐ ancientsinglegenome-hostassociated ☐ ancientmetagenome-environmental ☐ ancientmetagenome-anthropogenic

Reference

This pull request is to correct samples from the following publication(s):

Description

The issue is:

Checklist

☐ Checked the corrected entries follow conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA)
☐ Does your PR pass validation checks?
☐ Changelog is updated?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-21 22:49:05

Thanks Pete! Great to hear from you again.

Actually, I've just added Smith 2015, which was much simpler than the Slon samples and would make an easier first review 🙂 https://github.com/SPAAM-workshop/AncientMetagenomeDir/pull/167

} Becky-Cribdon (https://github.com/Becky-Cribdon)

#167 Smith2015

Pull Request

This PR is for a: ☑︎ New Publication(s) ☐ Correction

For the following list(s):

☐ ancientmetagenome-anthropogenic (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-anthropogenic)
☑︎ ancientmetagenome-environmental (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-environmental)
☐ ancientmetagenome-hostassociated (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-hostassociated)
☐ ancientsinglegenome-hostassociated (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientsinglegenome-hostassociated)

New Publication

Publication Information

This pull request is to add samples from the following publication(s):

Checklist

☑︎ Publication is published (preprints currently not accepted)?
☑︎ Checked the publication is not already in the database?
☑︎ Checked samples in this publication are not previously published data (newly re-sequenced metagenomes are OK!)?
☑︎ Samples are shotgun metagenomes (hostassociated-singlegenome may also contain whole-genome enriched data)?
☑︎ Checked the list follows conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA)?
☐ (If applicable) Updated the JSON files under `/assets/enums` with new categories (e.g. material or archive)?
☑︎ Does your PR pass validation checks?
☑︎ Changelog is updated?

If you do not know how to check errors in failed validation checks, expand here:

1. Press 'details' next to the failed check.
2. Expand the `test ancient <list>` line with the red X next to it.
3. Scroll to the bottom of the log, and look for a `DatasetValidationError` (usually the last line).
4. Read the error, and fix accordingly. Check the README for a given list for more guidance. If in doubt, ask!

Correction

This PR is for: ☐ ancientmetagenome-hostassociated ☐ ancientsinglegenome-hostassociated ☐ ancientmetagenome-environmental ☐ ancientmetagenome-anthropogenic

Reference

This pull request is to correct samples from the following publication(s):

Description

The issue is:

Checklist

☐ Checked the corrected entries follow conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA)
☐ Does your PR pass validation checks?
☐ Changelog is updated?

👍 James Fellows Yates, Pete Heintzman

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-23 06:29:41

@Antonio Fernandez-Guerra has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-24 10:17:00

@Pete Heintzman while I'm looking at GitHub, small reminder to send me your username for the org, then you can do the review for Becky

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-24 10:17:54

Ah wait, found you

🙌 Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-24 10:18:36

> You've invited Pete Heintzman to SPAAM-workshop! They'll be receiving an email shortly. They can also visit https://github.com/SPAAM-workshop to accept the invitation.

Pete Heintzman (peteheintzman@gmail.com)

2020-08-24 15:35:03

How are you guys dealing with negative/positive control samples?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-24 15:35:16

We are excluding controls at the moment

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-24 15:36:01

This becomes even more messy than actual samples unfortunately, so I've decided to leave that aside for the moment. It will be easier to retrieve them in the future though as we can use the ERS codes to backtrack to project level and grab them

👍 Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-24 17:00:46

First Issue to merge I didn't have to do anything 😍 thanks guys

Image from Android

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-24 17:04:21

🤝 @Pete Heintzman (can't find a hi-five)

🙌 Pete Heintzman, James Fellows Yates

Pete Heintzman (peteheintzman@gmail.com)

2020-08-27 14:56:35

@James Fellows Yates: what are the rules on unpublished metadata for a published sample? ie. if one has access to sample metadata that was not included in the original publication, and has not been published since, is this ok to include?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-27 14:57:05

No, must be published in some open archive

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-27 14:57:18

i.e. if it wasn't in the original publication but is on the ENA/SRA that's ok

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-27 14:57:29

I get nervous about 'data property' if it's not open like that

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-27 14:58:13

We can always update it in the future if the authors publish it somewhere else though

Pete Heintzman (peteheintzman@gmail.com)

2020-08-27 14:58:58

Thanks, and good point re. data property!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-27 14:59:26

Particularly when it comes to lat:lons, I've seen bad cases where that has been reported then the site looted 😞

😮 Pete Heintzman

Pete Heintzman (peteheintzman@gmail.com)

2020-08-27 20:18:16

I think the rule regarding radiocarbon dates will need to be relaxed for lake/marine sediment cores, as there are often far fewer direct dates than DNA samples (and generally very low overlap between directly dated and DNA layers). This is because bioturbation/reworking is far less of an issue, as compared to cave/archaeological sediments, and so interpolation using age-depth models is most often used to date samples.
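
As a rough illustration of the interpolation described above: a minimal sketch, assuming plain linear interpolation between directly dated layers. Published age-depth models are usually more sophisticated than this, and the function name, units and example values here are hypothetical.

```python
from bisect import bisect_left


def interpolate_age(depth, dated_layers):
    """Linearly interpolate an age for `depth` from (depth, age) pairs.

    `dated_layers` holds the directly dated layers of one sequence.
    Depths outside the dated range are rejected rather than extrapolated.
    """
    layers = sorted(dated_layers)
    if len(layers) < 2:
        raise ValueError("need at least two dated layers")
    depths = [d for d, _ in layers]
    if not depths[0] <= depth <= depths[-1]:
        raise ValueError("depth lies outside the dated range")
    i = bisect_left(depths, depth)
    if depths[i] == depth:
        # The DNA layer coincides with a directly dated layer.
        return float(layers[i][1])
    (d0, a0), (d1, a1) = layers[i - 1], layers[i]
    return a0 + (a1 - a0) * (depth - d0) / (d1 - d0)


# Hypothetical core: dated at 10 cm (1000 years BP) and 50 cm (5000 years BP).
dated = [(10, 1000), (50, 5000)]
print(interpolate_age(30, dated))  # 3000.0
```

The ages produced this way depend on the model and can change if the sequence is re-dated, whereas the depths themselves stay fixed.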

Pete Heintzman (peteheintzman@gmail.com)

2020-08-27 20:18:29

Separately re. Graham2016/Wang2017 sample ages: There is no table of sample ages, but there are ages for some of the sample depths reported in Wooller2018 (different pub), although the exact data are in a separate repository (with their own DOI...). Have included both the publication and dataset DOIs for these sample ages. For the remaining sample ages, these have been inferred from Figure 5 in Wang2017 (and by cross-checking the data points against values in Wang2017 Appendix 2, and using Graham2016 Figure 2 for additional guidance).

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-27 20:33:35

Hmm. I see your point, however it's not really good practice to start specifying 'this type of material' allows X and this type of material allows Y, as it isn't then consistent in how one should interpret the list as a whole.

I would rather make an exception for the environmental list, to allow undated layers (assuming they are from a sequence), provided the sample names have a logical order so the sequence could be reconstructed.

I almost wonder if we should include depths now (as we decided against before, right @Becky Cribdon?), not for the purpose of comparison between datasets, but rather to help get around missing dates? What do you think (@Antonio Fernandez-Guerra too)?

I'm not sure what is useful for you guys as I'm not in the field...

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 08:27:14

For Graham, that is really ugly reporting by authors 😕. But OK, I'm not sure this will pass our regex checks atm. I think it would be better to pick the most recent or more precise reporting of the date. I believe Figshare also allows linking back to the original publication?

Pete Heintzman (peteheintzman@gmail.com)

2020-08-28 10:25:58

*Thread Reply:* Yeah - that was my bad. For context, the final ages were calculated late on in the work, but the sample depths (which we were using to cross-correlate all proxies) remained static throughout. re. FigShare: and vice versa, so probably ok to just cite the publication DOI if only one DOI is desirable.

👍 James Fellows Yates

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 09:10:59

Depths: good question. Would depths in the 'Dir be enough for someone reading it to see, like, this sample isn't dated, but in the sequence it is between these two which are dated, so I'll assume it's close enough to the date I'm looking for? If so, I think it would help to say what sequence samples belong to, so something like core as well as just depth?

Perhaps the exception for the environmental list could be to allow dates from age-depth models where direct dates aren't available, rather than us including depths and having the reader interpret? Or would that be like the problem of calibrated/uncalibrated dates again, where the models (and inferred dates) could change over time?

Shreya (shreya23@uchicago.edu)

2020-08-28 17:11:21

*Thread Reply:* I found this with the Armbrecht paper: depth was given for all the samples but age was only given for the deepest sample. Since all the samples were recent I was able to put down 100 years for the shallower samples too, but not sure what I would have done otherwise.

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 09:11:32

I haven't worked with age-depth models myself (thanks coronavirus)

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:13:14

I don't really know what age-depth models are (predicted, I guess?). But yes I guess it would be...

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:15:24

But good point about the core. Is there a standard in the way these are reported?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:16:02

On the other hand, core wouldn't necessarily work for cave sediments where they are taken from an excavation trench face...

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:16:19

Unless we just ask people to please use sample names that include the core/transect 🤔

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:49:16

@Pete Heintzman @Antonio Fernandez-Guerra thoughts?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:49:45

This might mean we have to go back and check already submitted papers, but better to do this now before we get too many more samples

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:51:32

We could request that the sample name must include core/transect names in addition to the specific sub-sample name for DNA (where possible), and if some samples in a specific core have dates, 'inferred' dates of layers between dated layers are allowed?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:51:36

I think that would also work

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 09:52:56

I need to think about it… I don’t like adding too many semantics to the sample names

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 09:53:08

then it is hell to fix errors

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:53:36

Can we assume people normally do have the core in the sample name?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:53:39

Is that common in the field?

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 09:54:59

I don’t know, @Pete Heintzman and @Becky Cribdon are more familiar with the sedaDNA field than I am

👍 James Fellows Yates

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 09:55:40

I always think with a DB oriented mind

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:56:15

Yeah, I dunno. It's difficult. Like in @Shreya’s Pedersen paper the 'sample alias' listed in ENA is never used in the paper itself

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 10:09:11

*Thread Reply:* That was a right pain.

Shreya (shreya23@uchicago.edu)

2020-08-28 17:07:59

*Thread Reply:* Threw me for a loop but very grateful for @Pete Heintzman helping me link IDs to ages!!

🙌 Pete Heintzman

💪 James Fellows Yates

Shreya (shreya23@uchicago.edu)

2020-08-28 09:56:23

@Shreya has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:56:25

And is completely uninformative

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:56:35

(A bit like the BWINDI-1 in Campana)

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:56:56

But the submitted FASTQ files have informative sample IDs

Anneke ter Schure (a.t.m.t.schure@ibv.uio.no)

2020-08-28 09:58:48

Core names are usually a bit too long to include if you want workable sample names in the lab, so I think most people use a different indication like an abbreviated location name

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:59:32

Ah yeah, thanks for pointing that out @Anneke ter Schure. I didn't mean the literal core name but sample names with informative codes like the abbreviation

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 09:59:50

So Pedersen have CHLXXX and SPLXXX for each lake

Anneke ter Schure (a.t.m.t.schure@ibv.uio.no)

2020-08-28 10:00:54

Yes so an abbreviation is common, but I'm not sure if it's standard

😕 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:01:53

I wonder if we might just have to allow synthetic construction of combinations of sample names reported in the paper despite it being messy for databases @Antonio Fernandez-Guerra....

Pete Heintzman (peteheintzman@gmail.com)

2020-08-28 10:41:34

*Thread Reply:* I think this may be the way to go.

👍 James Fellows Yates

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:46:37

*Thread Reply:* I don’t know if this is possible, but can we create this field with the github actions?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:46:57

*Thread Reply:* What do you mean? I mean the person submitting must work that out

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:47:17

*Thread Reply:* You have to take it from the publication. A lot of people don't give sample alias/library names in the ENA/SRA submission

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:48:27

*Thread Reply:* I mean, if we have the columns defined, could we have a small function that combines the different fields when doing the PR?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:12:38

Anyway, as I'm not in the field I won't make the decision here. @Becky Cribdon as biggest contributor you're in charge of that 😉

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 10:13:12

😐

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:15:42

What we usually do is have a function in the DB that constructs those artificial names, so it is easy to manage and fix mistakes. If you ask the curators to construct those names you are prone to errors

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:16:34

especially if they are going to be created manually

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:49:36

Sidenote @Pete Heintzman I'm merging current master into your branch to check the new regex to prevent multiple DOIs

👍 Pete Heintzman

Pete Heintzman (peteheintzman@gmail.com)

2020-08-28 10:50:18

Including depth data somewhere is desirable, as this is static (even if the ages are updated by future works).

If including this in sample_name, rather than a separate data entry, then how about standardizing samples as:

[Core name/code][depth][sample-name1|sample-name2|...] or [Cave name][stratum][sample-name1|sample-name2|...]

If we do this, then we need to decide on what a ‘sample’ is (ie. are sediment sub-samples from the same layer considered the same sample?). I would suggest a sample is all data from a specific sediment layer. (apologies if this has been discussed earlier)

Happy to explain how age-depth models work, if anyone is still interested...
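
The bracketed format above could be generated by a small helper rather than typed by hand. A minimal sketch follows; the function name, the underscore separator between the three parts and the example values are all assumptions, since only the order of the parts and the `|` between sample names were proposed.

```python
def build_standardised_name(sequence, depth, sample_names, sep="_"):
    """Combine sequence (core/cave) code, depth (or stratum) and sample names.

    Follows the proposed order [sequence][depth][name1|name2|...].
    The separator `sep` is a hypothetical choice, not an agreed convention.
    """
    if not sequence:
        raise ValueError("a sequence (core or cave) code is required")
    if not sample_names:
        raise ValueError("at least one sample name is required")
    # Sub-samples from the same layer are joined with '|', as in the proposal.
    joined = "|".join(sample_names)
    return sep.join([sequence, str(depth), joined])


# Hypothetical example: two sub-samples from the same layer of one core.
print(build_standardised_name("core3", 120, ["S1", "S2"]))  # core3_120_S1|S2
```

Keeping the sequence and depth as separate inputs means an error in one field can be corrected once and the combined name regenerated.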

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 10:52:46

So, to bring back Antonio's suggestion, we could add a column for sediment samples for core/cave (sequence) name/code, and a column for depth, and GitHub would then merge those into a standardised sample name?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 10:52:59

But not "sample name" because that's already a field.

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:53:03

this would be the ideal

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 10:53:45

(I'd like to know how age-depth models work please Pete 😁 )

Pete Heintzman (peteheintzman@gmail.com)

2020-09-15 11:43:38

*Thread Reply:* @channel: here is a good primer on age-depth modelling used for dating sedimentary sequences:

Trachsel & Tedford 2017.pdf

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-18 12:48:02

*Thread Reply:* Thanks very much.

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:53:49

I am quite concerned about trusting people on the merging

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:54:14

We don’t want to have bad data

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:54:57

then if we/someone spots an error, we need to go back to change the field + name

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:55:48

Github actions and modifying PR can be dangerous unfortunately and would take time to fix.

But we already allow retroactive changes/corrections in new releases so I don't think it's so bad. Particularly as we are 'making up' sample names here anyway....

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 10:56:48

So manual merging is fine? If so, separate columns may be overkill.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:57:01

That would be my argument

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:58:44

I would keep separate columns then, it helps fixing errors and also is data we should consider including

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:58:51

and the merged one

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 10:59:35

So what are the two columns then?

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 10:59:36

it can seem a good idea to encode names with a lot of semantic data, but in the long term it is painful

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 11:01:17

if we have these names, I would keep the columns where the fields in the sample construction are coming from

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 11:01:35

Yeah @Pete Heintzman’s idea is nice but I think we might be getting too far away from the original publication (so look-up would be difficult). What I meant was that if there is no 'nice' name in the publication in some form that matches both what is referred to in the publication and in the ENA/SRA, we allow a contributor to synthetically merge names so a reader can infer both the sample and date/depth/core info

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 11:05:49

And was there a positive consensus on having a depth column?

👍 Pete Heintzman, Anneke ter Schure

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 11:06:08

I would be down for it

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 11:06:41

Because that makes it easier I guess. Then it's just making sure the sample id has a core+(sub)sample ID

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 11:07:09

In case there are undated cores in a given publication (Which can't be used)

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-08-28 11:07:31

then we can validate the artificially created id, if there is any

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 11:08:15

I've got another meeting now: @Becky Cribdon your turn 😉

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 11:26:51

I'm working on a prototype table and README. Watch this space.

👍 Pete Heintzman

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-08-28 11:41:48

Opinions, if you please:

ancientmetagenome-environmental_mockup_2020-08-28.tsv

README_mockup_2020-08-28.md

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 12:45:15

It's a good start but I personally feel that the standardised name column is redundant

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 12:45:38

As the information is otherwise already there in the other columns

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 12:46:31

I would rather have the sequence, depth and sample_name, but we recommend that if multiple sample names are reported, go for the one that is more informative (where possible)

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 12:47:44

So if you have options of BWINDI-1 vs DNA-MUB17-2C-6L-SAMPLE43, go with the latter

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-08-28 12:51:22

Or to use the example from Pedersen 2016 use CHL13211317 vs ICF-10 (as ICF are for the whole study, and CHL is the lake name)

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-01 10:31:13

Thoughts @channel? Please see Becky's suggestion and my proposed modification to drop the standardised name.

I think we can also specify that there must be at least two direct dates in a given sequence, but then 'inferred' ages of given layers are then allowed in that sequence. How does that sound?
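
The rule suggested above (inferred ages only allowed in a sequence with at least two direct dates) could be checked along these lines. This is only a sketch: the row keys `sequence` and `date_type`, and the values "direct" and "inferred", are hypothetical and are not columns in the actual list.

```python
from collections import Counter


def sequences_breaking_date_rule(rows, min_direct=2):
    """Return sequences with inferred ages but fewer than `min_direct` direct dates."""
    # Count directly dated samples per sequence.
    direct_counts = Counter(
        row["sequence"] for row in rows if row["date_type"] == "direct"
    )
    # A sequence breaks the rule if any of its ages is inferred
    # while it has too few direct dates to support the inference.
    offenders = {
        row["sequence"]
        for row in rows
        if row["date_type"] == "inferred"
        and direct_counts[row["sequence"]] < min_direct
    }
    return sorted(offenders)


# Hypothetical rows: coreA has two direct dates, coreB only one.
rows = [
    {"sequence": "coreA", "date_type": "direct"},
    {"sequence": "coreA", "date_type": "direct"},
    {"sequence": "coreA", "date_type": "inferred"},
    {"sequence": "coreB", "date_type": "direct"},
    {"sequence": "coreB", "date_type": "inferred"},
]
print(sequences_breaking_date_rule(rows))  # ['coreB']
```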

Pete Heintzman (peteheintzman@gmail.com)

2020-09-02 11:27:50

I agree with @James Fellows Yates’ suggestions - sample_name_standardised is unnecessary, and can easily be recreated by a user, if needed. Perhaps change ‘sequence’ to ‘sedimentary_sequence’, to avoid any confusion with DNA sequence.

👍 James Fellows Yates, Anneke ter Schure

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-03 16:24:17

How about this (the addition under sample age may not be clear enough):

ancientmetagenome-environmental_mockup_2020-09-03.tsv

README_mockup_2020-09-03.txt

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-03 20:01:46

*Thread Reply:* Btw, you could always make a draft PR, so once everyone agrees we can send it straight in

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-03 20:05:51

*Thread Reply:* And some tweaks (which is why I suggest the draft PR ;)):

```## sedimentary_sequence

Sediment only
Identifier for the sequence the sample was taken from, e.g. "core3" or "zonea19"
- Typically cores, or quadrant of excavation

If not reported, NA```

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-03 20:06:31

*Thread Reply:* ```## sample_name

Unique identifier for that sample as used in the publication
If samples are referred to by multiple names, use the most informative
If samples cannot be *directly* linked to data files by any names in the publication, generate names in the format e.g. [sequence][depth][original name]

> ⚠️ Mandatory value```

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-03 20:07:38

*Thread Reply:* otherwise I'm happy with it!

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 12:15:35

*Thread Reply:* Agree - this looks good!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 12:15:57

*Thread Reply:* @Becky Cribdon time for a PR! We will need to work out how to retroactively go back through published papers though...

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-04 12:38:59

*Thread Reply:* Changes noted and draft PR created 🙂

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 08:09:26

I just realised someone cited the following review in the preprint, would someone be willing to go through it and pick out papers it cites that we might be missing?

Edwards, M. E. (2020). The maturing relationship between Quaternary paleoecology and ancient sedimentary DNA. Quaternary Research, 96, 39–47. https://doi.org/10.1017/qua.2020.52

Cambridge Core

The maturing relationship between Quaternary paleoecology and ancient sedimentary DNA | Quaternary Research | Cambridge Core

The maturing relationship between Quaternary paleoecology and ancient sedimentary DNA - Volume 96 - Mary E. Edwards

Original URL: https://doi.org/10.1017/qua.2020.52

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 08:09:57

I have a feeling we are still sort of slim on the sedaDNA papers. Wasn't that even what Eske Willerslev started on (a while ago)?

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 09:05:49

The vast majority of sedaDNA papers, including the majority of papers cited by Edwards, are based on metabarcoding.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 09:06:11

Oh really?

✔️ Pete Heintzman

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 09:06:14

(that includes Eske's massive 2014 paper)

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:06:17

there are not so many with shotgun

😱 James Fellows Yates

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:06:54

but it is slowly changing

🎉 Pete Heintzman, James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 09:07:04

I'm surprised. I thought there was relatively big interest/utility/funding in that

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 09:07:23

There is, but there are a lot of technical challenges

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 09:07:31

There are also a lot of studies underway

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:07:37

I thought so as well when I started with sedaDNA

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 09:07:41

A lot will be published in the next year or two

:mask_parrot: James Fellows Yates

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:07:54

🔝

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:08:55

when I switched I was surprised by the lack of computational methods (my thing)

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:09:16

especially for functional analyses

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 09:09:46

Going back to Edwards (2020), the only two refs of note are:
• Parducci et al. (2019), but this looks to be the same data as Ahmed et al. (2018) - the latter is in the database, but may want to check for updated metadata.
• Lammers et al. (2020) - this is a cool one, but currently a preprint. A revision has been submitted, and we are hoping it will be accepted soon...

👍 James Fellows Yates

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:10:10

we are developing some new stuff to recover, i.e., plant traits from the shotgun data

👍 Pete Heintzman

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:10:42

and ways to authenticate the DNA at the same time we assemble, so no need of mapping

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-04 09:10:49

(Back to the 'Dir: if anyone fancies a fairly straightforward environmental review, perhaps your first, Braadbaart2020 was quite well reported as they go)

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-04 09:11:33

Ooh, we're working on authentication too: something like mapDamage for metagenomics. What does yours do?

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:11:55

this doesn’t work like mapDamage

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:12:58

we have three different lines: one based on Deep Learning, one an optimization of mapDamage for large datasets, and one that works at the assembly level with a hybrid approach using nucleotides and amino acids to access the low-abundance ones

👍 Becky Cribdon

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 09:13:19

@Pete Heintzman would you be OK adding me to the environmental core team on github? Then Becky can for example just ping that team and hopefully one of the people can do a review, which would help with task distribution

Pete Heintzman (peteheintzman@gmail.com)

2020-09-04 10:22:23

*Thread Reply:* Sure… but I have no idea how to do that. Pointers?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 11:31:37

*Thread Reply:* you don't need to do anything

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 11:33:57

*Thread Reply:* @Becky Cribdon @Pete Heintzman made you a new team. So in the future people submitting environmental PRs can go @thedir-team-dirt for reviewers or help, and you both get notifications.

👍 Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 11:34:11

*Thread Reply:* https://github.com/orgs/SPAAM-workshop/teams/thedir-team-dirt/members

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-04 12:39:38

*Thread Reply:* Heh heh, team dirt. I like it.

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-04 12:40:09

*Thread Reply:* Shame it's only us three.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 12:40:51

*Thread Reply:* We will get there 💪

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 12:41:19

*Thread Reply:* I'm wondering if I should submit a second abstract to ISBA about this...

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 12:41:38

*Thread Reply:* And if there are any sedaDNA conferences feel free to present there about this!

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-04 12:41:52

*Thread Reply:* Abstract about what precisely?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 12:41:58

*Thread Reply:* the Dir

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-09-04 12:42:04

*Thread Reply:* Oh right, definitely.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 12:42:05

*Thread Reply:* Tell people it's here, what it does, etc

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-04 12:42:11

*Thread Reply:* Want to recruit more contributors

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-04 09:16:51

on the functional side there is a lot to be done and many unknowns

👍 Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-07 09:06:48

@Becky Cribdon I've added to your PR for the structure change the new columns to the TSV, and also the JSON schema to check against.

However we need someone to go back and update those two columns for the already added papers. Is anyone willing to volunteer?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-07 09:21:08

@Anneke ter Schure as you are only occasionally here (so I guess maybe busy), maybe you would be willing to do a small review for Becky: https://github.com/SPAAM-workshop/AncientMetagenomeDir/pull/268/files only 7 samples from the same site!

Anneke ter Schure (a.t.m.t.schure@ibv.uio.no)

2020-09-07 09:23:49

I'll check it out this afternoon!

❤️ James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-08 08:28:49

Thoughts https://github.com/SPAAM-workshop/AncientMetagenomeDir/pull/268#issuecomment-688640430?

We can add this to Becky's restructuring PR if you guys agree

} jfy133 (https://github.com/jfy133)

Comment on #268 Braadbaart2020

Good point actually. I guess we have two options. We could either request that term from ENVO, which would take longer and there is no guarantee. Or, the easier route (with precedent) is to add another column called 'site_type'. Then we can specify: cave, rock shelter, ocean, lake etc. Those location-like terms are already present in envO I believe.

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-08 08:46:01

instead of site_type I would try to get it into one of the envO categories, i.e.:
• env_biome: Descriptor of the broad ecological context of a sample.
• env_feature: Compared to biome, feature is a descriptor of a geographic aspect or a physical entity that strongly influences the more local environment of a sample.
• env_material: Descriptor of the material that was displaced by the sample, or material in which the sample was embedded, prior to the sampling event.
With those you might be able to describe where the sample comes from. You can use any of the envO terms to define them and not overload the terms. If you remember our mail with Pier:

```Thus, for annotating a sample of material that a microbiome was extracted from, I would:

* Identify the broad scale environment (usually biome-level class) to characterise where the site was (tundra, desert, etc), being as specific as possible
* Identify one or more local scale entities. These can include the site itself (burial site, midden, etc) as well as any tools or other objects (we can add those as needed)
* Identify the material or materials that were inputs to the biomass/DNA extraction process from the site (soil, sediment, scrapings from tools, etc).```

👍 James Fellows Yates

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-08 08:48:06

this would imply having extra columns to fit broad scale, local scale and medium scale terms

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-08 08:48:52

I don't want to go that far though because it leads to swamping of the table. I think the second would be sufficient for here though.

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-08 08:49:25

we can have a feature column

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-08 08:49:44

in this case the feature would be cave and the material sediment

👍 James Fellows Yates, Anneke ter Schure, Pete Heintzman

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2020-09-08 08:49:59

both in envO

Katerina Guschanski (katerina.guschanski@ebc.uu.se)

2020-09-10 09:55:58

@Katerina Guschanski has joined the channel

Kun Huang (kun.huang@unitn.it)

2020-09-11 09:13:46

@Kun Huang has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-14 15:02:24

OK while I was waiting for some code to run I updated @Becky Cribdon’s PR with the new fields.

We have the new columns:

sedimentary_sequence
depth
feature

I've added the corresponding features to the feature column (as that was easy to infer from the material), but the other two are still identified as 'unknown'. So I think we can decide whether we merge this now (after one more review, maybe from @Pete Heintzman to check the phrasing of the categories is OK) and then retroactively update the depth/core IDs; OR should we wait for those columns to be filled before we merge?
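
A minimal sketch of how the rows still carrying the placeholder could be listed for the retroactive update. It assumes a tab-separated table with the `sedimentary_sequence` and `depth` columns named above; the `sample_name` column, the example rows, and the function name are hypothetical and only there for illustration.

```python
import csv
import io

# Placeholder value as described in the message above (compared case-insensitively).
PLACEHOLDER = "unknown"
COLUMNS_TO_CHECK = ("sedimentary_sequence", "depth")


def rows_needing_update(tsv_text, columns=COLUMNS_TO_CHECK):
    """Return (line_number, sample_name, unfilled_columns) for each row
    that still has the placeholder in any of the given columns."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    pending = []
    # The header is line 1, so the first data row is line 2.
    for line_number, row in enumerate(reader, start=2):
        unfilled = [
            column
            for column in columns
            if (row.get(column) or "").strip().lower() == PLACEHOLDER
        ]
        if unfilled:
            # "sample_name" is a hypothetical identifier column.
            pending.append((line_number, row.get("sample_name", ""), unfilled))
    return pending


# Hypothetical example rows, not taken from the real table.
EXAMPLE_TSV = (
    "sample_name\tsedimentary_sequence\tdepth\tfeature\n"
    "sample_A\tcore_1\t24\tlake\n"
    "sample_B\tunknown\tunknown\tcave\n"
    "sample_C\tcore_2\tUnknown\tlake\n"
)

for entry in rows_needing_update(EXAMPLE_TSV):
    print(entry)
```

Listing line numbers alongside the unfilled columns would let a volunteer work through the backlog paper by paper without re-reading the whole table.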

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-14 15:05:01

ah and @Anneke ter Schure just approved the Braadbaart paper just as I wrote that message, awesome! You can merge now @Becky Cribdon! https://github.com/SPAAM-workshop/AncientMetagenomeDir/pull/268

} Becky-Cribdon (https://github.com/Becky-Cribdon)

#268 Braadbaart2020

Pull Request This PR is for a ☑︎ <#new-publication|New Publication(s)> ☐ <#correction|Correction> For the follow list(s) ☐ ancientmetagenome-anthropogenic (<a href="https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-anthropogenic">README</a>) ☑︎ ancientmetagenome-environmental (<a href="https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-environmental">README</a>) ☐ ancientmetagenome-hostassociated (<a href="https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-hostassociated">README</a>) ☐ ancientsinglegenome-hostassociated (<a href="https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientsinglegenome-hostassociated">README</a>) New Publication Publication Information This pull request is to add samples from the following publication(s): Braadbart2020 10.1016/j.jasrep.2020.102468 This is to close <a href="https://github.com/SPAAM-workshop/AncientMetagenomeDir/issues/111">#111</a> Checklist ☑︎ Publication is published (preprints currently not accepted)? ☑︎ Checked the publication is not already in the database? ☑︎ Checked samples in this publication are not previously published data (newly re-sequenced metagenomes are OK!)? ☑︎ Samples are shotgun metagenomes (hostassociated-singlegenome may also contain whole-genome enriched data)? ☑︎ Checked the list follows conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA)? ☐ (If applicable) Made a separate PR to add new categories to controlled lists (stored unnder <code>/assets/enums</code>, e.g. material or archive)? If so: ☑︎ Changelog is updated to include the publication under 'Added'? 
If you do not know how to check errors in failed validation checks, expand here <ol><li>Press 'details' next to the failed check.</li><li>Expand the <code>test ancient &lt;list&gt;</code> line with the red X next to it.</li><li>Scroll to the bottom of the log, and look for a <code>DatasetValidationError</code> (usually the last line).</li><li>Read the error, and fix accordingly. Check the README for a given list for more guidance. If in doubt, ask!</li> </ol> Correction This PR is for ☐ ancientmetagenome-hostassociated ☐ ancientsinglegenome-hostassociated ☐ ancientmetagenome-environmental ☐ ancientmetagenome-anthropogenic Reference This pull request is to correct samples from the following publication(s): This is to close Description The issue is: Checklist ☐ Checked the corrected entries follow conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA) ☐ Changelog is updated to include the publication under 'Changed'?

Labels

good first issue

Comments

Pete Heintzman (peteheintzman@gmail.com)

2020-09-15 11:39:01

re. Braadbaart2020: all looks good. Now merged with master.

re. environment README edit for new sediment fields and inferred dates: made a few minor edits - all looks good. Not sure why the checks are failing.

👍 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-15 12:01:07

Checks will fail until the JSON is merged into matter

👍 Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-15 12:01:33

Master, as the tool checks off that branch because it reads off a github pages webpage

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-15 12:02:50

I would suggest that maybe we just merge that now and make an issue for someone to go back and update all the already added entries to add depth and core etc

👍 Becky Cribdon

Pete Heintzman (peteheintzman@gmail.com)

2020-09-15 12:03:14

Sounds reasonable.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-15 20:09:48

ok, proceeding with the merge!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-15 20:25:27

Done, and issue made: https://github.com/SPAAM-workshop/AncientMetagenomeDir/issues/284

} jfy133 (https://github.com/jfy133)

#284 Add sequence/depth/feature/date information for environmental metagenome list of release v20.09 pubs

Due to difficulties for the <a href="https://github.com/orgs/SPAAM-workshop/teams/thedir-team-dirt">@SPAAM-workshop/thedir-team-dirt</a>'s retrieval of relevant/related sedaDNA samples, we added new columns: • sequence • depth • feature And modified specification for • date (dates for modelled dates if at least two samples in sequence are dated are now allowed) All samples for release v20.09, and also Braadbaart 2020, need to be re-scraped for the above data - currently they are listed as 'Unknown'.

Labels

bug

Milestone

<a href="https://github.com/SPAAM-workshop/AncientMetagenomeDir/milestone/2">Release v20.12: Ancient City of Nessebar</a>

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-09-15 20:25:40

So if anyone knows any enthusiastic students who wanna help out 😅

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-13 13:54:17

Sooo @channel feedback time!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-13 13:54:25

The reviewer said:

"A general thought on the environmental metagenomics category (most closely aligned with my expertise): a column indicating the taxonomic focus of the published study might be useful, so that there is some indication of what alignments the authors performed."

What do you think about this?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-13 13:54:34

Is this necessary/useful/worth it?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-13 15:56:48

*Thread Reply:* Maybe it's relevant if there's been some selection or enrichment before generating the data?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-13 15:58:17

*Thread Reply:* But I'm imagining a typical user wanting to find comparable datasets, not really the analysis methods (i.e. what alignments).

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-13 15:59:21

*Thread Reply:* But, if the data itself is taxonomically limited, maybe it would be helpful to know that.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-13 16:01:03

*Thread Reply:* Yeah I was also thinking that

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-13 16:01:31

*Thread Reply:* like mt data from Slon et al. (not that we specify the data, but they may only upload data along those lines 😏 )

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-13 16:01:40

*Thread Reply:* So we have a tentative yes from you?

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:32:42

*Thread Reply:* I agree with Becky's suggestion, although it is slightly different from what the reviewer is specifically asking.

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:34:50

*Thread Reply:* tl;dr: I don't think the reviewer's suggestion is necessary, otherwise we would need to update every time someone reanalyzes the data.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:35:22

*Thread Reply:* In my specification I have stated 'the original purpose' of the data.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:35:57

*Thread Reply:* Because I agree that would become complicated, but assuming it's the same DNA extraction, that I think is the main influence you would have downstream

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:39:06

*Thread Reply:* OK - that works

👍 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:39:44

*Thread Reply:* I guess it makes sense, for example people looking for just hominin DNA might do some funky stuff in the future to try and enrich for that

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:40:15

*Thread Reply:* Or if someone did some fine-filtering of the sediment to remove certain things (that might bias results)

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:43:03

*Thread Reply:* There are some seriously deep rabbit holes if one starts going into differences in wet lab methods used too deeply. As a compromise, I think enrichment v. not of libraries is the most important.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:46:24

*Thread Reply:* Yeah sorry, I actually meant that by just saying 'they looked at animals or bacteria' that you can ignore that whole thing of lab-methods completely.

By using eukaryotic vs microbial, I would hope and assume people who are interested in animal DNA will do similar things (and would only download that data), and people who are only interested in microbial DNA would do similar things and look for that data...

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 10:36:50

@Anneke ter Schure I know you weren't involved in the manuscript directly, but do you have any opinions on this? Would you find it useful do you think? What about you @Antonio Fernandez-Guerra?

Maybe something like:

study_type: eukaryotic, microbial, or faunal, floral, faunal_flora, bacterial, virus?

I think this would have to be a 'custom' content (rather than based on an ontology).

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:40:56

*Thread Reply:* Perhaps use NCBI taxonIDs as these will offer the most flexibility (from single species to phyla)?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:42:46

*Thread Reply:* I looked at that actually! And while it was a good idea at first, I observed just now that people filtered in lots of weird and wonderful ways... which would result in very long lists. So far all the papers could be split into: faunal, floral, or microbial (or some combination of the three), in my opinion at least.

🙌 Pete Heintzman
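
A rough sketch of how such a custom (non-ontology) category list could be checked, assuming the three broad groups named above. The underscore used to join combinations and the function name are assumptions for illustration only, not the project's actual convention or schema.

```python
# The three broad groups proposed in the thread.
ALLOWED_GROUPS = {"faunal", "floral", "microbial"}


def parse_focus(value, separator="_"):
    """Split a focus value such as 'faunal_floral' into its groups.

    The separator is a hypothetical choice; the thread does not fix one.
    Raises ValueError for unrecognised or repeated groups.
    """
    parts = [part.strip().lower() for part in value.split(separator)]
    unrecognised = [part for part in parts if part not in ALLOWED_GROUPS]
    if unrecognised:
        raise ValueError(f"unrecognised group(s) in {value!r}: {unrecognised}")
    if len(set(parts)) != len(parts):
        raise ValueError(f"repeated group in {value!r}")
    return frozenset(parts)


print(sorted(parse_focus("faunal_floral")))
```

Keeping the list this short is the trade-off discussed above: it loses the resolution of NCBI taxon IDs but avoids very long per-paper lists.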

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:03:35

Do any of you know what a 'core drive' exactly is?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:03:42

image.png

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:30:30

*Thread Reply:* A core drive is a section of a sediment core. Some coring methods recover one drive (e.g. 1 m) of a core at a time from the same hole. Multiple holes are cored so that there are overlapping drives to reconstruct a composite core record (as the ends of drives tend to be disturbed).

The details in your example: 1B-1B_23-25
1 = coring site on the lake (in this case, centre)
B = hole code
1 = core drive/section
B = core section is cut in half longitudinally. Half B was sampled for DNA
23-25 = depth in cm into the drive that DNA sample was derived from
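
The breakdown above can be sketched as a small parser. This is only an illustration built from the single example ID `1B-1B_23-25`; the pattern, field names, and function name are assumptions, and other studies will use different naming schemes.

```python
import re
from typing import NamedTuple


class CoreSampleID(NamedTuple):
    site: str             # coring site on the lake
    hole: str             # hole code
    drive: str            # core drive/section
    half: str             # longitudinal half that was sampled
    depth_top_cm: int     # depth into the drive (not composite depth)
    depth_bottom_cm: int


# Pattern inferred from one example only; treat it as hypothetical.
_PATTERN = re.compile(r"^(\d+)([A-Z])-(\d+)([A-Z])_(\d+)-(\d+)$")


def parse_core_sample_id(sample_id):
    """Split an ID like '1B-1B_23-25' into its components."""
    match = _PATTERN.match(sample_id)
    if match is None:
        raise ValueError(f"ID does not follow the assumed scheme: {sample_id!r}")
    site, hole, drive, half, top, bottom = match.groups()
    return CoreSampleID(site, hole, drive, half, int(top), int(bottom))


print(parse_core_sample_id("1B-1B_23-25"))
```

Note that the depth is measured within the drive, so converting it to a depth in the composite core would need the drive offsets from the original publication.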

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:31:23

*Thread Reply:* Ahh cool thank you

👍 Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:33:26

@Becky Cribdon @Pete Heintzman I've just pinged you for a review on Github. I've added all the depth, sequence_name and also a study_type column (as I didn't see any harm in it) to make the reviewer happy. I don't think you need to go through massively in-depth for the review, but if you could check a couple of papers that the depth/sequence_name looks good that would be great.

Also feel free to change the study type column.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:34:12

(@Pete Heintzman maybe you can update Graham et al for the sequence_name, based on drive thing you just explained for me ;))

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:35:05

*Thread Reply:* np

❤️ James Fellows Yates

Pete Heintzman (peteheintzman@gmail.com)

2020-11-16 13:46:10

@James Fellows Yates: For the reviewer response, if you want an example of how sedaDNA data has already been reused (shameless plug): Wang et al. 2017 looked for woody plants by reanalyzing the Graham et al. 2016 data, which was originally generated to look for woolly mammoth.

😆 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 13:46:54

I see, so you can say we would rather not specify that category because people will re-use for their own purposes?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-16 14:23:34

*Thread Reply:* Personally, I would rather not specify "study type" if it hasn't affected the raw data the 'Dir is pointing to.

How about changing the name from "study type" to something like "enrichment", to mark whether the sample has been manipulated to change the taxonomic profile? So, if they enriched for mammal DNA it would be "faunal"?

But, for example, Gaffney 2020 was a floral study, but the raw data contains everything. We just picked out the plants for further analysis. So I wouldn't want to specify a "study type" for that: that could suggest the data is more limited than it is.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:30:35

*Thread Reply:* Yeah good point

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:30:49

*Thread Reply:* I've not looked specifically if any filtering was done though

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:31:11

*Thread Reply:* So we need a new title 🤔 .

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:31:35

*Thread Reply:* taxonomic_category?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:31:43

*Thread Reply:* originaltaxonomicpurpose?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-16 14:33:08

*Thread Reply:* Could you expand on "I've not looked specifically if any filtering was done"? What do you mean by filtering here?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:34:33

*Thread Reply:* As far as I understood the reviewer:

" A general thought on the environmental metagenomics category (most closely aligned with my expertise): a column indicating the taxonomic focus of the published study might be useful, so that there is some indication of what alignments the authors performed. "

he wants to find comparative datasets e.g. if he is interested in looking at e.g. floral eDNA from different sites at the same time period

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:35:09

*Thread Reply:* (I say he as I'm pretty sure it's Mike Bunce who reviewed 😉 )

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:35:27

*Thread Reply:* But he wants to compare results of the original paper

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:36:09

*Thread Reply:* I am assuming that this implies that he would assume there were upstream 'decisions' that may influence extraction strategy for example.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:37:11

*Thread Reply:* So he wants to make sure these are approximately like-by-like.

What I meant in my message above is that the data I have input into the PR under the current column study_type, I've just done it based on what the original publication was focusing on

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:37:22

*Thread Reply:* I have not actually looked if there was any enrichment

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:38:14

*Thread Reply:* (sorry if this still isn't coming off as clear 🤦)

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:38:36

*Thread Reply:* What about just study_taxonomic_focus?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-16 14:43:37

*Thread Reply:* Sorry, I'm finding this really complicated! But thanks for clarifying your statement.

So, for the reviewer, he would be searching the 'Dir for floral data and wants a column that would let him exclude a study that enriched for mammal DNA, because there's unlikely to be any floral DNA for him to use? I agree that would be useful.

But I think "taxonomic focus" or similar are still too broad: having a taxonomic focus doesn't necessarily mean the data is taxonomically limited. His assumption that upstream influences downstream wouldn't always hold, and could be misleading.

If I wanted mammal DNA and Gaffney 2020 had a "focus" of plants, would that suggest I exclude it? Despite it actually having raw data for everything?

Is there a title that implies actual limitation only, so studies like Gaffney 2020 could be "none" or "NA"?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:48:27

*Thread Reply:* No don't worry. I'm also finding it complicated. To be clear: I am assuming that he may wonder if enrichment has happened, but I don't know for sure. I think I may have confused both you and @Pete Heintzman by mentioning enrichment 🤦 . We shouldn't focus on that (I was reading about TARA oceans the other day and they did particle filtering so maybe that's why I brought it up).

Looking through the comments, what he might actually be thinking about is 'meta-analysis' (which he says is something important that we are missing from the 'benefits' of the project). So he's sort of looking for some basically annotated bibliographic info, maybe?

In which case it's actually a bit detached from the purpose of the project but I understand him from that point of view.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 14:49:18

*Thread Reply:* So to take your example above:

> So, for the reviewer, he would be searching the 'Dir for floral data and wants a column that would let him exclude a study that enriched for mammal DNA, because there's unlikely to be any floral DNA for him to use? I agree that would be useful.

Rather than enrichment, maybe what he means is that he's looking for other studies that looked at flora. So he can then take their already published 'OTU tables' (so to say), to compare with his own?

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-17 10:39:29

*Thread Reply:* Riiight, so instead of a flag for limitations, this column would be like a flag for particular detail?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-17 10:40:08

*Thread Reply:* yeah sort of. What was the original purpose for the generation of that particular data

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 16:01:57

*Thread Reply:* @James Fellows Yates’s last comment is how I interpret the reviewer’s comment. However, having a separate column for shotgun/enrichment would also be very valuable to inform re-analysis of the data sets. (starting a new topic to continue this)

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 16:06:43

Instead of study_type what about original_study_focus? or study_primary_focus or something like that?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-16 16:07:03

primary I guess would be good in the sense it implies you can do other stuff

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-17 10:40:13

*Thread Reply:* Yes, I prefer "primary". Thanks for clarifying this above.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-17 10:42:35

*Thread Reply:* Ok, will update

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-17 10:49:43

OK @Pete Heintzman @Becky Cribdon (a.k.a. team-dirt ;)) the PR should be ready for your review now 🙂

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-17 11:06:03

*Thread Reply:* On it :)

❤️ James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-17 12:33:25

@Pete Heintzman Becky has approved so if you're happy I'll merge

✔️ Pete Heintzman

Pete Heintzman (peteheintzman@gmail.com)

2020-11-17 12:36:31

Will get to this later this afternoon -- other ms fires to put out first...

🔥 James Fellows Yates, Pete Heintzman

:male_firefighter: James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-17 12:36:37

No problem!

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 14:20:50

OK - created a pull request to update some other details (sequence, depth, ages, etc) — an easy review for @Becky Cribdon.

💪 James Fellows Yates

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 14:21:09

*Thread Reply:* New ages:

Screen Shot 2020-11-18 at 14.19.57.png

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 14:27:25

*Thread Reply:* By the way, the sequence data for the new samples, that are not in Ahmed2018, do not seem to have been made available.

Becky Cribdon (r.cribdon@warwick.ac.uk)

2020-11-18 15:03:09

*Thread Reply:* Looks legit 🙂 Approved and merged.

🙌 Pete Heintzman

:mask_parrot: James Fellows Yates

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 14:38:50

re. point 16 of the reviewer response letter, I have added: An exception to this approach is for environmental samples, many of which are not directly dated (ages are inferred from a sedimentary sequence, such as an age-depth model for a lake sediment core) or are dated using alternative methods (OSL, tephras, U-series, etc). For these reasons, we make an exception and use calibrated ages for these samples (which we have stated in the README for environmental samples).

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-18 14:39:52

Ok! I might standardise the response a bit, but it'll make him happy I guess.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-18 14:40:14

Are all age-depth models calibrated?

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 14:41:56

By convention, yes. As one cannot infer a radiocarbon age, as this is a direct measurement.

👍 James Fellows Yates

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 16:02:04

re. point 24: How about:

“We thank the reviewer for raising this point. We have now added a column for environmental samples called ‘study_primary_focus’, which gives a broad overview of which group the data were originally targeting (faunal, floral, microbial). We only include the focus of the original study and a broad grouping because: (1) as reference databases grow and are refined, future analyses targeting the same groups will yield different results (hopefully with improved precision), and (2) it is beyond the scope of AncientMetagenomeDir to keep track of all re-analyses of each data set. We note that this second point has already occurred (e.g. Wang et al. 2017 looked for woody plants by reanalyzing the Graham et al. 2016 data, which was originally generated to look for woolly mammoth).

The reviewer’s point also inspired us to create a new category of datatype to differentiate between shotgun and target enriched data, as this information would guide which samples to include in a reanalysis. The new types are ‘shotgunmetagenomic’, ‘targetenrichedhuman’, and ‘targetenrichedmammal’, but these will be expanded as new bait panels are developed and applied.”

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 16:04:06

Note that this latter point will also apply to other metagenomic categories

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-18 19:09:45

Ok, I like point one

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-18 19:10:10

Will add that (thanks!). Point two I'm not so sure about though, as when we are talking about enrichment, this is library level information not sample level

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-18 19:11:50

And that's already something else I plan to add in a library-level extension of 'the Dir', if @Katherine Eaton is able to semi-automate the process of pulling that data for us. So I think I'll leave that out for now.

👍 Pete Heintzman

Pete Heintzman (peteheintzman@gmail.com)

2020-11-18 20:54:44

Sounds good!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-11-23 14:07:37

Y/N? https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7643826/

PubMed Central (PMC)

Sediment Metagenomes as Time Capsules of Lake Microbiomes

The reconstruction of ecological time series from lake sediment archives can retrace the environmental impact of human activities. Molecular genetic approaches in paleolimnology have provided unprecedented access to DNA time series, which record evidence ...

Original URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7643826/

Pete Heintzman (peteheintzman@gmail.com)

2021-03-11 14:51:51

*Thread Reply:* Y. For the bottom sediment samples only.

👍 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2020-12-08 21:04:28

@Becky Cribdon did a review of Sarkissian. That is indeed a weird one, but I've simplified it a bit (see the comment on the PR - should just be a copy-and-paste job)

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-03-10 16:31:26

@Pete Heintzman https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7887274/ one for 'the Dir?

PubMed Central (PMC)

Environmental palaeogenomic reconstruction of an Ice Age algal population

Palaeogenomics has greatly increased our knowledge of past evolutionary and ecological change, but has been restricted to the study of species that preserve either as or within fossils. Here we show the potential of shotgun metagenomics to reveal population ...

Original URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7887274/

Pete Heintzman (peteheintzman@gmail.com)

2021-03-11 14:41:02

For sure! 😉

Pete Heintzman (peteheintzman@gmail.com)

2021-03-11 14:46:16

This one too: https://onlinelibrary.wiley.com/doi/full/10.1111/1755-0998.13311

Pete Heintzman (peteheintzman@gmail.com)

2021-03-11 14:54:07

OK — all added as issues. Note that there will be a deluge of sedaDNA samples coming out this year!

Pete Heintzman (peteheintzman@gmail.com)

2021-03-11 14:56:25

@James Fellows Yates All of these ‘issues’ are small, so will make good training exercises for the uninitiated.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-03-11 15:24:41

Busy busy! Ok, I might leave it for the June release then

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-03-12 11:28:04

Need one approval by someone here @channel!

https://github.com/SPAAM-community/AncientMetagenomeDir/pull/377

} jfy133 (https://github.com/jfy133)

#377 Add Rampelli 2021

Pull Request

This PR is for a ☑︎ New Publication(s) ☐ Correction

For the following list(s):
☐ ancientmetagenome-anthropogenic (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-anthropogenic)
☑︎ ancientmetagenome-environmental (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-environmental, ask @thedir-team-dirt for advice)
☑︎ ancientmetagenome-hostassociated (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientmetagenome-hostassociated, ask @thedir-team-bugparty for advice)
☐ ancientsinglegenome-hostassociated (README: https://github.com/SPAAM-workshop/AncientMetagenomeDir/tree/master/ancientsinglegenome-hostassociated, ask @thedir-team-pathopeeps for advice)

New Publication

Publication Information

This pull request is to add samples from the following publication(s): https://www.nature.com/articles/s42003-021-01689-y

This is to close #364 (https://github.com/SPAAM-community/AncientMetagenomeDir/issues/364)

Checklist
☑︎ Publication is published (preprints currently not accepted)?
☑︎ Checked the publication is not already in the database?
☑︎ Checked samples in this publication are not previously published data (newly re-sequenced metagenomes are OK!)?
☑︎ Samples are shotgun metagenomes (hostassociated-singlegenome may also contain whole-genome enriched data)?
☑︎ Checked the list follows conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA)?
☑︎ Changelog is updated to include the publication under 'Added'?
☐ Pull request has passed validation checks (see automated comment from github-bot)?
☑︎ Review requested from corresponding team (see list of lists above for each team)?
☑︎ (If applicable) Made a separate PR to add new categories to controlled lists (stored under /assets/enums, e.g. material or archive)? If so the corresponding PR is here:

Correction

This PR is for ☐ ancientmetagenome-hostassociated ☐ ancientsinglegenome-hostassociated ☐ ancientmetagenome-environmental ☐ ancientmetagenome-anthropogenic

Reference

This pull request is to correct samples from the following publication(s):

This is to close

Description

The issue is:

Checklist
☐ Checked the corrected entries follow conventions as described in the corresponding sample type's README file (e.g. using ERS/SRS accession codes for ENA/SRA)
☐ Changelog is updated to include the publication under 'Changed'?
☐ Pull request has passed validation checks (see automated comment from github-bot)?
☐ Review requested from corresponding team (see list of lists above for each team)?

Comments

Reviewers

@SPAAM-community/thedir-team-dirt

✅ Pete Heintzman

Nico Rascovan (nicorasco@gmail.com)

2021-03-18 23:14:38

@Nico Rascovan has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-07 20:34:17

@Pete Heintzman I've gone through all three. For Liang and Schulte I have a couple of comments, but otherwise thanks for the speedy work 👍

Pete Heintzman (peteheintzman@gmail.com)

2021-06-09 19:28:07

@James Fellows Yates re. Liang: went with 'Sea coast', as not technically the shore. If OK, then I have already updated this in the Enum and Liang PR.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-09 19:28:26

*Thread Reply:* Perfect!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-09 19:28:40

*Thread Reply:* Go ahead and merge both!

Pete Heintzman (peteheintzman@gmail.com)

2021-06-09 19:29:28

*Thread Reply:* Still needs a 'review' 😉

Pete Heintzman (peteheintzman@gmail.com)

2021-06-09 19:30:00

*Thread Reply:* For Enum PR

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-09 19:41:24

*Thread Reply:* Oh pants. One sec

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-09 19:42:44

*Thread Reply:* Done!

Pete Heintzman (peteheintzman@gmail.com)

2021-06-09 19:29:11

re. Schulte: yeah, I think at some point we will need to re-visit the target enrichment rule. This is probably going to become a more common feature of sedaDNA metagenomics studies.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-09 19:43:59

*Thread Reply:* Hmm ok. We can keep an eye on it then. I guess we do have a precedent with the pathogen stuff, but that is often more clear-cut because each paper basically refers to a single taxon, whereas it might be more variable with sedaDNA...

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-21 14:23:43

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8208683/ one to include?

PubMed Central (PMC)

Shotgun metagenomics reveals distinct functional diversity and metabolic capabilities between 12 000-year-old permafrost and active layers on Muot da Barba Peider (Swiss Alps)

The warming-induced thawing of permafrost promotes microbial activity, often resulting in enhanced greenhouse gas emissions. The ability of permafrost microorganisms to survive the in situ sub-zero temperatures, their energetic strategies and their metabolic ...

Original URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8208683/

Pete Heintzman (peteheintzman@gmail.com)

2021-06-21 17:04:01

I think they were looking at the living microbial component in the permafrost rather than dead 12 kyr-old aDNA, but it is unclear.

Suggest not to include.

👍 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-23 10:30:52

Other than PIA (from @Becky Cribdon), what tools exist for seda/eDNA that are dedicated to ancient samples?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-23 10:31:15

or designed for tackling issues that are particularly pertinent to ancient DNA?

Pete Heintzman (peteheintzman@gmail.com)

2021-06-23 18:01:10

There are not many specialist tools, although Benjamin Vernot's Kallisto would be another

Pete Heintzman (peteheintzman@gmail.com)

2021-06-23 18:02:04

And PathPhynder by Bianca de Sanctis and co.

👍 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-23 18:02:22

Kallisto isn't aDNA focused though right?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-06-23 18:03:05

That's for RNA-seq or transcriptomics and cannibalised for their purposes iirc

Pete Heintzman (peteheintzman@gmail.com)

2021-06-23 18:04:36

Haven't deep-dived into the version Vernot uses, so not sure if it was modified/optimized for sedaDNA or is just the rna-seq version

👍 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-07-12 14:57:14

@channel y/n? https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC8257590/

PubMed Central (PMC)

Holocene life and microbiome profiling in ancient tropical Lake Chalco, Mexico

Metagenomic and traditional paleolimnological approaches are suitable to infer past biological and environmental changes, however, they are often applied independently, especially in tropical regions. We combined both approaches to investigate Holocene ...

Original URL: https://www.ncbi.nlm.nih.gov/labs/pmc/articles/PMC8257590/

Pete Heintzman (peteheintzman@gmail.com)

2021-07-16 16:35:30

*Thread Reply:* Yes, and issue added.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-07-12 20:51:55

https://www.cell.com/current-biology/fulltext/S0960-9822(21)00818-6

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-07-12 20:52:09

Another one? This one is more borderline though

Pete Heintzman (peteheintzman@gmail.com)

2021-07-16 16:33:09

*Thread Reply:* Would say yes, although I see that all raw reads are combined into a single accession/fastq. Is there a policy on this? @James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-07-16 16:54:40

*Thread Reply:* The read IDs should allow you to separate them, I think...?

Or is it multiple samples merged into one?

Pete Heintzman (peteheintzman@gmail.com)

2021-07-19 13:17:51

*Thread Reply:* Multiple samples merged into one FASTQ... as far as I can tell.

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-07-19 13:19:08

*Thread Reply:* 🤦

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-07-19 13:19:24

*Thread Reply:* That's really shitty

Barbara (bbmoguel@gmail.com)

2021-09-09 15:42:49

@Barbara has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-09 15:45:19

Hi @Barbara I just prepared your Lake Chalco data for AncientMetagenomeDir, and I noticed in the metadata uploaded to the SRA that you sequenced on NextSeqs.

But in your publication you say you sequenced on TruSeq Nano and Nextera XT platforms? Those are library prep kits, not sequencing machines...

So I just wanted to confirm when we start generating library metadata, that it is indeed NextSeqs you sequenced on for everything

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-09 15:45:45

(if you wanted to check the metadata anyway, the PR is here: https://github.com/SPAAM-community/AncientMetagenomeDir/pull/431, you can see under 'files changed')

GitHub

Moguel2021 by jfy133 · Pull Request #431 · SPAAM-community/AncientMetagenomeDir

Pull Request This PR is for a New Publication(s) Correction For the following list(s): ancientmetagenome-anthropogenic (README) ancientmetagenome-environmental (README, ask @thedir-team-dirt...

Original URL: https://github.com/SPAAM-community/AncientMetagenomeDir/pull/431

Barbara (bbmoguel@gmail.com)

2021-09-09 15:56:36

OMG... let me check... For sure there was no Nextera. Must be a problem from when I uploaded the data

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-09 15:56:51

No worries! The data on the ENA looks correct

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-09 15:56:55

it's just in the methods section

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-09 15:57:21

image.png

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-09 15:57:50

(and Figure 2).

If the metadata on the SRA is correct it's fine, that's where people who need to know would get their data from 😉

Barbara (bbmoguel@gmail.com)

2021-09-09 16:06:16

Ok let me check very well. I am teaching now, but when I finish I will check. Thank you

👍 James Fellows Yates

Barbara (bbmoguel@gmail.com)

2021-09-10 03:16:00

Yes, I read that Vernot adapted Kallisto to sedaDNA in his paper and it works very well. :)

} James Fellows Yates (https://spaam-community.slack.com/team/UPVPSB7V2)

That's for RNA-seq or transcriptomics and cannibalised for their purposes iirc

👍 James Fellows Yates, Pete Heintzman

Pete Heintzman (peteheintzman@gmail.com)

2021-09-10 17:35:37

@Barbara @James Fellows Yates Am reviewing Moguel2021, but need some clarifications:

• Samples S1 and S2 look to be modern top sediments, and so should not be included in the 'dir?

• Samples 501 and 502 are <6000 years in the pub. but 3000 yrs in the 'dir. Where did the 3 ka ages come from? I note that these samples are from different depths, so may not be the same age?

• There is a discrepancy on ENA: SRS4426892 and SRS4426891 are either sample 501 or S2, depending on whether one looks at the experiment or sample alias. Which is which?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-10 17:37:15

*Thread Reply:* I thought we said that for sequence cores we also keep the modern samples?

I took the dates from the midpoint of one of the schematic diagrams

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-10 17:37:51

*Thread Reply:* For the last one, if the mismatch is on ENA then that's up to Barbara

Pete Heintzman (peteheintzman@gmail.com)

2021-09-10 17:39:29

*Thread Reply:* Thanks for the quick clarification, James!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-10 17:40:15

*Thread Reply:* Please check I got the dates right though...

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-10 17:40:45

*Thread Reply:* Or whether you would make the same rule-of-thumb interpretation

Barbara (bbmoguel@gmail.com)

2021-09-10 17:38:07

Hello, yes you are right, S1 and S2 are surface samples

Barbara (bbmoguel@gmail.com)

2021-09-10 17:40:12

And for 501 and 502 there is a problem because the sediments are mixed, so we decided to just put more than 5,000 years! It is so hard to obtain good dating

Barbara (bbmoguel@gmail.com)

2021-09-10 17:41:14

From 6,000 years BP the dates are pretty good

Pete Heintzman (peteheintzman@gmail.com)

2021-09-10 17:47:25

ok, thanks Barbara! Then 3,000 yr as a midpoint for those two samples is reasonable.

Barbara (bbmoguel@gmail.com)

2021-09-10 17:53:26

Could be, but they are very mixed

Barbara (bbmoguel@gmail.com)

2021-09-10 17:54:13

Please tell me if I can collaborate at some point!! I am a bit lost now but I would like to help 😊

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-10 17:55:49

Oh absolutely! If you look at #ancientmetagenomedir you can see the most recent instructions :) We have some other papers to be added, and if you know of any other ancient shotgun metagenomics papers we are missing, please add them to GitHub as issues!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-10 17:57:58

And we have a metadata-thon doodle for a one day event to get all the library level metadata

Barbara (bbmoguel@gmail.com)

2021-09-10 18:20:07

aahhh ok I will check! thank you

Pete Heintzman (peteheintzman@gmail.com)

2021-09-27 19:32:20

@Barbara, re. pull request for Moguel2021: Did you get a chance to look at the discrepancy on ENA? SRS4426892 and SRS4426891 are either sample 501 or S2, depending on whether one looks at the experiment or sample alias. Which one is which?
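A discrepancy like this could in principle be spotted automatically from an ENA file report. The sketch below is only illustrative: it assumes the ENA Portal API `filereport` endpoint with the `read_run` fields `secondary_sample_accession`, `sample_alias` and `experiment_alias`, and it assumes the two aliases are expected to be identical strings, which will not hold for every submission. The function names are hypothetical, and the project accession is left as an argument because it is not given in this thread.

```python
import csv
import io
from urllib.parse import urlencode

# Assumed endpoint and field names; check the ENA Portal API docs before relying on them.
ENA_FILEREPORT = "https://www.ebi.ac.uk/ena/portal/api/filereport"
FIELDS = ["secondary_sample_accession", "sample_alias", "experiment_alias"]


def filereport_url(project_accession: str) -> str:
    """Build a file-report URL for one project (no request is sent here)."""
    query = urlencode({
        "accession": project_accession,
        "result": "read_run",
        "fields": ",".join(FIELDS),
        "format": "tsv",
    })
    return f"{ENA_FILEREPORT}?{query}"


def alias_mismatches(report_tsv: str) -> dict:
    """Map each sample accession to (sample_alias, experiment_alias) where the two differ."""
    mismatches = {}
    for row in csv.DictReader(io.StringIO(report_tsv), delimiter="\t"):
        sample, experiment = row["sample_alias"], row["experiment_alias"]
        if sample != experiment:
            mismatches[row["secondary_sample_accession"]] = (sample, experiment)
    return mismatches
```

Downloading the TSV from `filereport_url(...)` and passing its text to `alias_mismatches` would list the accessions worth checking by hand. It cannot say which alias is the correct one; only the submitter can.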

Barbara (bbmoguel@gmail.com)

2021-09-27 19:33:18

let me see!

Barbara (bbmoguel@gmail.com)

2021-09-27 19:50:24

Yes, I see what you mean, the sample name is wrong in the ENA: the corresponding sample name for SRS4426892 is 50_1 and for SRS4426891 is S2.

Barbara (bbmoguel@gmail.com)

2021-09-27 19:51:13

Tell me if this helps!

Pete Heintzman (peteheintzman@gmail.com)

2021-09-27 19:54:48

perfect, thanks!

Barbara (bbmoguel@gmail.com)

2021-09-27 19:58:54

Do you know how I can correct the data in ENA? I am trying to find out how to do that

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-09-27 19:59:11

I think you have to contact the helpdesk

Barbara (bbmoguel@gmail.com)

2021-09-27 20:03:33

ok! thank you!

Pete Heintzman (peteheintzman@gmail.com)

2021-10-25 17:49:59

In case anyone was thinking of adding the new Wang et al. 2021 dataset — the uploaded data are the aligned reads only and not raw data. So not one for the AncientMetagenomeDir. https://www.nature.com/articles/s41586-021-04016-x

Nature

Late Quaternary dynamics of Arctic biota from...

Nature - A large-scale metagenomic analysis of plant and mammal environmental DNA reveals complex ecological changes across the circumpolar region over the past 50,000 years, as biota responded to...

Original URL: https://www.nature.com/articles/s41586-021-04016-x

🤦 James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-10-25 17:50:29

*Thread Reply:* wah wah

Lennart Schreiber (lennartschreiber@web.de)

2021-11-30 10:15:39

@Lennart Schreiber has joined the channel

Anan Ibrahim (ananhamido@hotmail.com)

2021-12-16 09:16:55

@Anan Ibrahim has joined the channel

Anan Ibrahim (ananhamido@hotmail.com)

2021-12-21 14:14:39

@James Fellows Yates in the README file for the env dir you mention that library_concentration is applicable to the host-associated metagenome list only, but the column is still in the env lib. I have a publication that reports the concentrations. Shall I include them, or do you want to remove the column entirely from the env list?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:16:43

Oooooh good question

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:16:49

Would you ever use that in your work?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:17:27

@Pete Heintzman @Barbara? Do you guys use library concentration information for anything?

We use it in microbiome stuff for tools like decontam which allow you to remove likely lab-contaminants in taxonomic profiles
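For context, the reason concentration matters here: a lab contaminant tends to make up a larger share of the reads when the library has little input DNA. Below is a simplified Python sketch of that frequency-based idea. It is not decontam's actual implementation (decontam is an R package), and the function name is made up for illustration. It compares two fits in log space, one with slope -1 (contaminant-like) and one with slope 0 (genuine taxon), and returns the ratio of their squared residuals.

```python
import math


def frequency_contaminant_ratio(frequencies, concentrations):
    """Squared-residual ratio of the contaminant fit to the non-contaminant fit.

    Values well below 1 suggest the taxon behaves like a contaminant,
    i.e. its relative frequency scales roughly as 1 / concentration.
    All frequencies and concentrations must be > 0.
    """
    log_f = [math.log(f) for f in frequencies]
    log_c = [math.log(c) for c in concentrations]
    n = len(log_f)

    # Non-contaminant model: log(frequency) is constant across libraries.
    mean_f = sum(log_f) / n
    sse_non = sum((y - mean_f) ** 2 for y in log_f)

    # Contaminant model: log(frequency) = a - log(concentration), slope fixed at -1.
    a = sum(y + x for x, y in zip(log_c, log_f)) / n
    sse_con = sum((y - (a - x)) ** 2 for x, y in zip(log_c, log_f))

    if sse_non == 0:
        # Frequency is perfectly flat, so nothing points to a contaminant.
        return math.inf
    return sse_con / sse_non
```

This is only useful when library concentrations are actually recorded per library, which is the metadata column being discussed.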

Anan Ibrahim (ananhamido@hotmail.com)

2021-12-21 14:17:48

I don't think so, and in any case the concentrations are usually not reported

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:18:30

Ok then yes. If no-one else has added the information to the table already at least.

We would need to remove it from the table itself, and also the json schema
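As a rough sketch of what removing a column from both places could involve: the Python below drops a named column from TSV text and the matching property from a JSON schema. The schema layout (`items` → `properties` / `required`) and both function names are assumptions for illustration, not a description of the repository's actual files or tooling.

```python
import csv
import io
import json


def drop_column_from_tsv(tsv_text: str, column: str) -> str:
    """Return the TSV text with one named column removed."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [name for name in reader.fieldnames if name != column]
    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=kept,
        delimiter="\t",
        extrasaction="ignore",  # silently skip the dropped column's values
        lineterminator="\n",
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


def drop_property_from_schema(schema: dict, column: str) -> dict:
    """Return a copy of the schema without the column's property or 'required' entry."""
    cleaned = json.loads(json.dumps(schema))  # deep copy via a JSON round trip
    items = cleaned.get("items", {})
    items.get("properties", {}).pop(column, None)
    if column in items.get("required", []):
        items["required"].remove(column)
    return cleaned
```

Both functions return new objects rather than editing in place, so the change can be diffed and validated before anything is written back.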

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2021-12-21 14:32:41

I would keep it

Antonio Fernandez-Guerra (antonio@metagenomics.eu)

2021-12-21 14:33:30

We are soon going to release ancient microbial environmental studies where we use this info

:salute: James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:35:20

OK then @Anan Ibrahim there is your answer ☝️

😏 Anan Ibrahim

Anan Ibrahim (ananhamido@hotmail.com)

2021-12-21 14:38:03

*Thread Reply:* Then we should note it down for the MIxS checklist

:salute: James Fellows Yates

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:44:38

*Thread Reply:* Please post it in #metadata-standards so we don't forget!

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:35:31

I generally would think it's useful info to have

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2021-12-21 14:35:40

But leave it up to each field to decide

Barbara (bbmoguel@gmail.com)

2022-01-10 18:08:19

Hello!! I didn't use the library concentration!! Just the sample DNA concentration.

Miki Bálint (miklos.balint@senckenberg.de)

2022-01-24 17:40:42

@Miki Bálint has joined the channel

Heike Zimmermann (hz@geus.dk)

2022-02-01 09:43:31

@Heike Zimmermann has joined the channel

Pete Heintzman (peteheintzman@gmail.com)

2022-02-04 10:10:24

New addition for the ’dir, although the raw data do not seem to be available yet (may change in the coming days): https://www.ncbi.nlm.nih.gov/bioproject/799375 https://www.sciencedirect.com/science/article/pii/S0277379122000191

ncbi.nlm.nih.gov

ID 799375 - BioProject - NCBI

A BioProject is a collection of biological data related to a single initiative, originating from a single organization or from a consortium. A BioProject record provides users a single place to find links to the diverse data types generated for that project

Original URL: https://www.ncbi.nlm.nih.gov/bioproject/799375

sciencedirect.com

Late Pleistocene palaeoenvironments and a possible glacial refugium on northern Vancouver Island, Canada: Evidence for the viability of early human settlement on the northwest coast of North America

Multi-proxy palaeoecological analyses of lake cores from two sites on northern Vancouver Island reveal previously undocumented non-arboreal environmen…

Original URL: https://www.sciencedirect.com/science/article/pii/S0277379122000191

👍 James Fellows Yates, Anan Ibrahim

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2022-02-04 11:29:43

Let's keep an eye on it and make an issue when the data is available

✅ Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2022-02-04 11:29:49

I'm still getting through the library backlog

Pete Heintzman (peteheintzman@gmail.com)

2022-04-04 11:57:07

Now added the above publ. as a ’dir issue. Any takers? https://github.com/SPAAM-community/AncientMetagenomeDir/issues/838

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2022-04-04 11:58:07

I'm hoping that once we get through the remaining library metadata stuff we can add all the new sample stuff as well 😉

👍 Pete Heintzman

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2022-04-04 11:58:14

We have quite a backlog 😱

Nikolay Oskolkov (nikolay.oskolkov@scilifelab.se)

2022-05-01 20:19:50

@Nikolay Oskolkov has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2022-05-31 10:19:52

Y/N?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2022-05-31 10:19:52

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9132974/

PubMed Central (PMC)

Lake microbiome and trophy fluctuations of the ancient hemp rettery

Lake sediments not only store the long-term ecological information including pollen and microfossils but are also a source of sedimentary DNA (sedDNA). Here, by the combination of traditional multi-proxy paleolimnological methods with the whole-metagenome ...

Original URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9132974/

✅ Pete Heintzman

Pete Heintzman (peteheintzman@gmail.com)

2022-05-31 17:09:05

Looks legit — one to include, I reckon.

Biancamaria Bonucci (biancamaria.bonucci@ut.ee)

2022-06-14 12:16:12

@Biancamaria Bonucci has joined the channel

Pete Heintzman (peteheintzman@gmail.com)

2022-07-12 17:04:18

New addition for the ’dir (Github issue already added): Courtin et al. Pleistocene glacial and interglacial ecosystems inferred from ancient DNA analyses of permafrost sediments from Batagay megaslump, East Siberia. https://onlinelibrary.wiley.com/doi/full/10.1002/edn3.336

❤️ James Fellows Yates, Nikolay Oskolkov

Vilma Pérez (vilma.bq@gmail.com)

2022-10-20 06:09:55

@Vilma Pérez has joined the channel

Kadir Toykan Özdoğan (k.t.ozdogan@uu.nl)

2022-11-11 09:06:16

@Kadir Toykan Özdoğan has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2022-11-11 09:06:43

@Kadir Toykan Özdoğan thanks again for all the PRs. Unfortunately I have a deadline today and am on holiday next week, but I promise the first thing I do when I get back is review your PRs!

Kadir Toykan Özdoğan (k.t.ozdogan@uu.nl)

2022-11-11 10:29:40

*Thread Reply:* No worries!

❤️ James Fellows Yates

Francisco Zorrilla (fz274@cam.ac.uk)

2023-01-07 17:33:00

@Francisco Zorrilla has joined the channel

Reed Harder (reedharder@gmail.com)

2023-06-01 22:03:17

@Reed Harder has joined the channel

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2023-12-15 11:51:40

@channel can anyone explain this paper to me:

https://onlinelibrary.wiley.com/doi/10.1002/edn3.493

Kadir Toykan Özdoğan (k.t.ozdogan@uu.nl)

2023-12-15 12:23:31

*Thread Reply:* I quickly had a look at it. I think what they are saying is that the 'off-target' DNA from the captured libraries is similar to the shotgun data in terms of microbes.

Kadir Toykan Özdoğan (k.t.ozdogan@uu.nl)

2023-12-15 12:23:51

*Thread Reply:* But I need to read it fully, I might have misunderstood

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2023-12-15 12:55:24

*Thread Reply:* Where do they mention the off-target?

Kadir Toykan Özdoğan (k.t.ozdogan@uu.nl)

2023-12-15 12:58:24

*Thread Reply:* Abstract: 'These data are consistent among study sites and between replicates processed with different methodologies (shotgun sequencing and targeted capture), which highlights that the “off-target” fraction of metagenomic data used to study macro-ecosystems can also be used to investigate synchronous changes in microbial communities.'

4.1 Structural and functional shifts in microbial communities '...three different sequencing targets—shotgun sequencing, PalaeoChip Arctic v1.0, and a Bovidae specific bait-set (Murchie, Monteath, et al. (2021); Murchie, Kuch, et al. (2021); Murchie et al. (2022))—all biological replicates cluster together (Figure 2; Figures S2 and S8).'

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2023-12-15 13:03:23

*Thread Reply:* Jesus fucking Christ the methods are so bad then?!?

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2023-12-15 13:03:42

*Thread Reply:* Well... @Kadir Toykan Özdoğan willing to make an issue on the Dir 😅

James Fellows Yates (james_fellows_yates@eva.mpg.de)

2023-12-15 13:03:45

*Thread Reply:* ?

Kadir Toykan Özdoğan (k.t.ozdogan@uu.nl)

2023-12-15 13:25:12

*Thread Reply:* haha, sure 😃