This is not meant to be an exhaustive list of skills for data scientists, because the field is moving so quickly that a tool relevant today might not be relevant in six months. It is rather an attempt to provide an extensive list of skills and tools that are useful in developing data science projects. Of course, lacking one of these skills does not preclude anyone from being identified as a data scientist.
Statistics and Econometrics: probability theory, ANOVA, MLE, regressions, time series, spatial statistics, Bayesian Statistics (MCMC, Gibbs sampling, MH Algorithm, Hidden Markov Model), Simulations (Monte Carlo, agent-based modeling, etc.)
Scientific approach: experimental design, A/B testing, technical writing skills, Randomized Controlled Trial
Machine Learning: supervised and unsupervised learning, CART, algorithms (Support Vector Machines, PCA, GMM, K-means, Deep Learning, Neural Networks), data and scientific computing packages (Pandas, NumPy, SciPy, etc.) and machine learning and artificial intelligence packages (TensorFlow, H2O, etc.)
Mathematics: Matrix algebra, relational algebra, calculus, optimization (linear, integer, convex, global)
Big Data Platforms: Hadoop, MapReduce, Hive, Pig, Spark, Storm, Cassandra
Text mining: Natural Language Processing, LDA, LSA, Part-of-speech tagging, Parsing, Machine Translation
Visualization: graph analysis, social network analysis, Tableau, ggplot, D3, Gephi, Neo4j, Alteryx
Business: business and product development, budgeting and funding, project management, marketing surveys, domain/sector knowledge
Systems Architecture and Administration: DBA, SAN, cloud, Apache, RDBMS
• Structured Dataset: SQL, JSON, BigTable
• Unstructured Dataset: text, audio, video, BSON, NoSQL, MongoDB, CouchDB
• Multi-structured Dataset: IoT, M2M
Data Analysis: feature extraction, stratified sampling, data integration, normalization, web scraping, pattern recognition
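To make two of the Data Analysis items a little more concrete, here is a minimal sketch of stratified sampling followed by normalization, using only the Python standard library. The function names, the toy customer data, and the 10% sampling fraction are all assumptions made for illustration; in a real project these steps would more likely be done with a package such as Pandas.

```python
import random
from collections import defaultdict
from statistics import mean, pstdev


def stratified_sample(rows, key, fraction, seed=0):
    """Draw the same fraction of rows from every stratum defined by `key`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    strata = defaultdict(list)
    for row in rows:
        strata[row[key]].append(row)
    sample = []
    for group in strata.values():
        # Keep at least one row per stratum, even for tiny groups.
        k = max(1, round(len(group) * fraction))
        sample.extend(rng.sample(group, k))
    return sample


def z_score(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


# Hypothetical toy data: 80 "retail" and 20 "corporate" customers.
rows = [{"segment": "retail", "spend": float(i)} for i in range(80)]
rows += [{"segment": "corporate", "spend": float(100 + i)} for i in range(20)]

sample = stratified_sample(rows, key="segment", fraction=0.1)
normalized = z_score([r["spend"] for r in sample])
```

The point of stratifying is that the sample keeps the 80/20 split between segments, which a simple random draw of ten rows would not guarantee.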
Note: the above is an adapted excerpt from my book “Big Data Analytics: A Management Perspective” (Springer, 2016).