About
Yuan is a Senior Principal Software Engineer at Red Hat AI. Previously, he has led AI…
Activity
-
The job market is tough right now, especially for new graduates and talented people impacted by layoffs. I have several free 2-month LinkedIn…
Posted by Yuan Tang
-
https://lnkd.in/gm2PM7uN Good read on production-scale LLM inference. Nice to see LMCache included as part of the stack. While llm-d focuses on…
Liked by Yuan Tang
Experience
Education
-
Georgia Institute of Technology
Online part-time program (finished 8 graduate-level classes with a focus on computing systems).
Finished classes: Software Development Process, Databases, Computer Networks, Software Architecture and Design, Artificial Intelligence for Robotics, Data & Visual Analytics, Entrepreneurship, Computer Law.
Licenses & Certifications
Publications
-
Couler: Unified Machine Learning Workflow Optimization in Cloud
40th IEEE International Conference on Data Engineering (ICDE)
-
metric-learn: Metric Learning Algorithms in Python
Journal of Machine Learning Research (JMLR)
-
TensorFlow Estimators: Managing Simplicity vs. Flexibility in High-Level Machine Learning Frameworks
Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD)
-
TensorFlow in Practice《TensorFlow实战》
Beijing Publishing House of Electronics Industry
Patents
-
System and Method for Distributed Task Execution
Issued CN110609749B CN110609749A
The application discloses a distributed task operation method, system, and equipment. In one embodiment of the present disclosure, the method acquires task fragments, distributes them to effective computing nodes for processing, and acquires the task results. During distribution, each effective computing node can be assigned only one task fragment at a time: the node starts running a fragment once it is assigned, and it can be assigned a new fragment only after it finishes the current one. When an effective computing node fails, the task fragment currently assigned to it is reassigned. When an effective computing node is closed or stolen, the task fragments currently allocated to it are reallocated. When a new effective computing node is brought up, unassigned task fragments are assigned to it.
-
Systems and Methods for Detecting and Remedying Software Anomalies
Issued US US10635519B1
A computing platform may obtain observed data vectors related to the operation of a topology of nodes that represents a software application running on an uncontrolled platform, wherein each observed data vector comprises data values captured for a given set of operating variables at a particular point in time. After obtaining the observed data vectors, the computing platform may apply an anomaly detection model to the observed data vectors and then based on the anomaly detection model, may identify an anomaly in at least one operating variable. In turn, the computing platform may determine whether each identified anomaly is indicative of a problem related to the application, and based on a determination that an identified anomaly is indicative of a problem related to the software application, cause a client station to present a notification.
Courses
-
Artificial Intelligence for Robotics
-
Combinatorics
honors
-
Computer Law
-
Computer Networks
-
Concepts of Real Analysis
-
Data & Visual Analytics
-
Data Mining
graduate/honors
-
Data Structures and Algorithms
-
Databases
-
Discrete Mathematics
honors
-
Entrepreneurship
-
Entrepreneurship Analysis
-
Geometry
honors
-
Intermediate Programming
-
Introduction to Digital Systems
-
Introduction to Macroeconomics
honors
-
Introduction to Probability
-
Introduction to SAS programming
-
Introductory Programming
-
Linear Algebra
graduate/honors
-
Mathematical Game Theory
honors
-
Matrices
honors
-
Multivariate Calculus
honors
-
Object-oriented Programming with Web-Based Applications
honors
-
Ordinary and Partial Differential Equations
-
Partial Differential Equations and Fourier Transforms
-
Social Computing and Networking Systems
graduate/honors
-
Software Architecture and Design
-
Software Development Process
-
Topics in Statistical Computing in R
Projects
-
List of Contributed Open-source Projects
-
I am a maintainer/committer of the following projects:
Argo: project lead of Argo Workflows and maintainer of Argo CD. Check out awesome-argo.
Kubeflow: Kubeflow Steering Committee member, project lead and co-chair of training operators. Check out awesome-kubeflow.
KServe: Standardized serverless ML inference platform on Kubernetes.
TensorFlow: co-author of TensorFlow Estimators and maintainer of TensorFlow I/O; recipient of Google Open Source Peer Bonus in 2016 for my contributions to TensorFlow.
XGBoost: maintainer of the Python and R packages.
Apache MXNet: co-author of the Scala package.
Couler: designed the unified interface and contributed to many major components of the system.
ElasticDL: designed and implemented several major components of the system.
I have also (co-)authored projects in machine learning, data visualization, and tooling, including TensorFlow in R, metric-learn, ggfortify, and reticulate.
There are also other projects to which I have made non-trivial contributions as I came across areas for improvement, such as KServe, H2O, SQLFlow, pandas, and SynapseML. Please visit my projects page and GitHub page for more details.
https://github.com/sponsors/terrytangyuan
Honors & Awards
-
Awards by Teams at Red Hat and IBM
Red Hat
- IBM Tech Award, Dec 10th, 2024
- Red Hat AI Engineering Jedi Award, Red Hat Multiplier and Influence, Oct 18th, 2024
- Numerous internal awards and recognitions from Red Hat colleagues
-
Multiple Awards by Teams at Alibaba Group
Alibaba Group
- Inner Source Pioneer, April 17th, 2021
- Top Open Source Contributor of the Year, Jan 20th, 2020
- Best Pull Request of the Week, May 3rd, 2020
-
Outstanding China Mainland Books Copyright Exported to Taiwan
The Publishers Association of China
-
Outstanding Author
Beijing Publishing House of Electronics Industry
-
Open Source Peer Bonus
Google
https://opensource.googleblog.com/2016/09/google-open-source-peer-bonus-program.html
-
Best Virtual Reality Hack
HackRPI at Rensselaer Polytechnic Institute
http://hack.rpi.edu/2014/prizes
-
Summer Research Grant
Schreyer Honors College
Awarded $1,200 to support my summer research expenses
-
Pre-eminence in Honors Education Fund
Schreyer Honors College
Awarded $5,000 for my summer study
-
Mu Sigma Rho Statistics Honorary Society Inductee
Penn State Department of Statistics
-
Penn State PMASS fellowship & NSF PMASS fellowship
Penn State Department of Mathematics and National Science Foundation
Mathematics Advanced Studies Semester
Tuition reduced to in-state level ($6,400) and internal stipend ($6,400)
-
Tsui Honors Scholarship
Schreyer Honors College
Awarded $5,300 scholarship for 2014 spring semester
-
MindSumo Programming Challenge Winner & Summit for Software Engineers
Capital One
Organizations
-
Penn State Schreyer Honors College
Scholar