Science requires faith: faith in the understanding and expertise of scientists, and in the correct application of methods (for generating and for analysing data). We have to create an environment that makes such faith possible, e.g. through independent funding; careers should also depend not on "good results" but on good science. As soon as science and money get entangled, science will sooner or later degrade into a business and work like a business: effective at producing the anticipated results, which are not very trustworthy, but ineffective at being creative and thorough. There is much to think about, and I do not have good solutions; I am just pointing out the problems. A rather harsh option would be to decouple career and science: why should a scientist make a career anyway? Having said that, the discussion here is whether someone can actually produce artificial data that resemble the results of an experiment. Real science is about results being reproducible, meaning that given the same experimental conditions (say, pressure, volume, temperature, purity of materials, and accuracy of measurement), you ought to get similar results. A random number generator, as I said, entails equal probabilities, whereas natural phenomena are not supposed to assign equal probabilities to two different conditions; otherwise we would have no natural laws. If someone attempts to generate artificial data with something that looks like the distribution of raw data from a natural phenomenon, to me it has at least a flavor of naivety, because one is assuming a probability distribution that the real data may not have; or it is a moot point, because one has fallen into a tautology: you have already arrived at THE law governing the generation of your natural data, so of course you can produce artificial data. That is called numerical simulation, and if it successfully reproduces some raw data, that is a paper. So, can we spot the "good" from the "bad" data?
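As a minimal sketch of the point about equal probabilities (all numbers here are invented for illustration): repeated measurements of a law-governed quantity cluster around a value, while a naive uniform random generator spreads its output evenly over a range, and even the crudest spread statistic separates the two.

```python
import random
import statistics

random.seed(42)  # fixed seed so this sketch is itself reproducible

# Hypothetical "repeated measurements" of one quantity (values invented):
# a law-governed process clusters around a true value with small noise,
# while a naive uniform generator gives every value in a range equal
# probability.
natural = [random.gauss(9.81, 0.05) for _ in range(1000)]
uniform_fake = [random.uniform(9.0, 10.6) for _ in range(1000)]

# The standard deviation alone betrays the naive generator.
print("natural spread:", round(statistics.stdev(natural), 3))
print("uniform spread:", round(statistics.stdev(uniform_fake), 3))
```

This is only the most naive fabrication strategy, of course; the discussion below is precisely about whether more careful fabrication can be detected at all.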
Excel dummy data generator trial

Basic statistical descriptions and tests of the obtained data are seldom challenged, and when they are, any big dataset of numbers produced by the authors will generally silence the questioners. Most believe that descriptive statistics are a reliable fingerprint of the original data; I think this naive notion must be challenged as well, so I would appreciate any ideas and tools for demonstrating that here. Just a quick note: in biology I deal with living beings, which are complex, unpredictable experimental subjects, and complex biochemistry is only one of the reasons. Unfortunately, the scientific method relies on reproducibility of results, yet in biology one can never be sure that results will come out the same every time and everywhere (actually they never will, but there should be some similarity), even under the same conditions. This is dramatically illustrated by the extreme variability among cloned mice with well-known genes responding to the same clinical trial, done at the same time under the same conditions and with a low number of repetitions (limited by price, time, manpower, and bioethics). This last example is the most common justification given by cancer researchers for data manipulation (removal of "weird results") and irreproducibility. If one ponders this too much, one becomes more afraid of medicines than of diseases and loses all faith in biomedical research; yet this is crude reality, and it cannot be hidden by data manipulation and beautiful oratory, as is done today. I would like people to question data, and to start accepting that, more often than not, controversial data are all there ever was. I wish journals would demand submission of the complete data. Having these data, one could analyse (at least in principle) whether there was some manipulation. But I see the main advantage in that the data could be re-used, re-analyzed, meta-analyzed, and so on. If we have reasons to distrust scientists, science is actually dead (and has become just a business, at best).
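One way to demonstrate that a mean-and-SD summary is a weak fingerprint is to construct two datasets with identical descriptive statistics but entirely different shapes. A minimal sketch, with all values invented for illustration:

```python
import statistics

# Two invented "measurement" sets, built so the usual summary matches:
# a is bimodal and never takes its own mean value; b clusters around it.
a = [4, 4, 4, 4, 4, 6, 6, 6, 6, 6]
b = [3, 4, 5, 5, 5, 5, 5, 5, 6, 7]

print(statistics.mean(a), statistics.mean(b))            # identical means
print(statistics.pvariance(a), statistics.pvariance(b))  # identical variances
# A table reporting mean ± SD cannot tell these two apart.
```

Any number of distinct datasets can be tuned to the same summary in this way, which is exactly why matching descriptive statistics cannot authenticate raw data.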
In the biological and biomedical sciences, conclusions in papers are drawn from statistical analyses illustrated by very simple depictions of the obtained results (even in the most "trusted" periodicals), and repeating experiments under exactly the same conditions is very rare, quite often impossible in practical terms. Nowadays, on PubPeer, papers are questioned post-publication and exposed on the basis of obvious signs of image manipulation, particularly in blots from cancer-research papers. This has revealed much about the true evidence behind these papers and made it evident that the traditional Published-Impact-Factor-Peer-Reviewed standard is not indicative of reliability in the scientific literature. Reproducibility may be the ideal, yet in my field it is far from reality. As for the generator itself, the first step is to load the pandas package and use the DataFrame function: data = pd.DataFrame(...).
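To make that first step concrete, here is a minimal sketch of such a dummy-data generator built on pandas; the group names, sample size, and target mean/SD values are all invented for illustration, not taken from any real experiment:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded generator, so the sketch is repeatable

# Invented targets: the per-group mean and SD the fabricated data should show.
targets = {"control": (1.0, 0.2), "treated": (1.8, 0.3)}

rows = []
for group, (mu, sd) in targets.items():
    for value in rng.normal(mu, sd, size=8):  # 8 fake replicates per group
        rows.append({"group": group, "measurement": round(float(value), 2)})

data = pd.DataFrame(rows)
print(data.groupby("group")["measurement"].agg(["mean", "std"]))
```

Precisely because such a script takes minutes to write, the descriptive statistics it reproduces cannot by themselves authenticate a dataset; that is the notion the discussion above asks us to challenge.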