Dimitar Lukarski
Welcome to my homepage!
About me
I like mixing applied mathematics
and computer science, and luckily I am able to do this for a living: I
work on interdisciplinary topics in the area of computational
mathematics and emerging parallel hardware such as GPUs,
multi-core CPUs and embedded devices.
After my PhD at Karlsruhe Institute of Technology (Germany) and a
post-doc at Uppsala University (Sweden), I started a small
start-up where we developed a fast sparse linear algebra library
(called PARALUTION) with a special focus on multi-core CPU and GPU
hardware support. We had two very exciting years with various
customers, from supercomputer centers to multi-physics simulation
companies. In the end, we managed a successful exit by selling the
copyright of the software to one of the semiconductor giants.
Currently, I live in Silicon Valley, where I work for Apple in the
Special Project Group. Due to the nature of the project, I cannot
give any details. At a high level, my day-to-day job
involves a lot of C++ programming, architecture and software
design, as well as intra-/extra-team communication and
coordination.
Contact
dimitar [AT] lukarski [DOT] com
http://de.linkedin.com/in/dimitarlukarski
Work Experience
Oct 2016 : present – Senior Software Engineer at Apple Inc., Silicon Valley, CA, USA
Jan 2015 : June 2016 – Co-founder and CEO at
PARALUTION Labs, Karlsruhe, Germany
August 2012 : July 2014 – Post Doctoral Researcher, Uppsala University, Sweden
Uppsala Programming for Multicore Architectures Research Center (UPMARC),
Department of Information Technology, Division of Scientific Computing
Sept 2008 : June 2012 - Research Associate, Karlsruhe Institute of Technology (KIT), Germany
Shared Research Group “New Frontiers in High Performance Computing Exploiting Multicore and Coprocessor Technology”
with collaboration partner Hewlett-Packard
Sept 2005 : June 2006 - Teaching Assistant, Technical University of Sofia, Bulgaria
April 2005 : Dec 2005 - Internal Contract, Technical University of Sofia, Bulgaria
"Asymptotic Methods for Differential Equations. Research and Control", Contract 562/NI-10/2005
March 2003 : June 2006 - Sky Archive Data Center, Bulgarian Academy of Sciences
System Support and Development of Software "Wide-field Plate Database",
Contract I-1103/2001 with Space Research Institute, National Science Fund (NSF), Ministry of Education and Science (MES)
March 2003 : Oct 2004 - Institute of Zoology, Bulgarian Academy of Sciences
System and Network Administrator
Interests
- Modern C++
- Real-time systems
- Embedded devices
- Low energy chips
- Low latency computation
- High performance computing (HPC)
- Parallel programming / stream-based models
- Clusters and supercomputers
- Heterogeneous and homogeneous computing
- Fine-grained parallel algorithms
- Computational mathematics
- Software design and implementation
- Sparse graphs and matrices
Education
Sept 2008 : Jan 2012 - Karlsruhe Institute of Technology (KIT), Germany
Department of Mathematics, Dr. rer. nat. (Ph.D. Natural Sciences)
Thesis: Parallel Sparse Linear Algebra for Multi-core and Many-core Platforms - Parallel Solvers and Preconditioners
Advisor: Jan-Philipp Weiss (KIT, Germany)
Reviewers: Vincent Heuveline (KIT, Germany) and Richard Vuduc (Georgia Institute of Technology, USA)
Examination grade: 1.0 (magna cum laude)
Oct 2006 : Aug 2008 - University of Karlsruhe (TH), Germany
Department of Mathematics, M.Sc. (Master of Science)
2006-2008 - DAAD (German Academic Exchange Service) scholarship holder
2007 - “DAAD-Preis für hervorragende Leistungen ausländischer Studierender” (DAAD Prize for Outstanding Achievements of International Students) - best student award
2008 - Graduated top of the class
Thesis: Specific Aspects of a Parallel Implementation of a 3D CFD Solver on the Cell Architecture
Feb 2005 : June 2005 - University of Ioannina, Greece
Department of Mathematics, Erasmus student
Oct 2002 : July 2006 - Technical University of Sofia (TU-Sofia), Bulgaria
Department of Applied Mathematics and Informatics, B.Sc. (Bachelor of Science)
2006 – Graduated top of the class
Thesis: Modeling of Thin Plates with Structural Defects - Finite Elements Method
Sept 1997 : May 2002 - High School ‘Electronic Systems’ (TU-ES), Sofia, Bulgaria
Teaching Experience
- Lecture assistant and co-developer of lecture slides/notes
(Uppsala University, Sweden)
- "Parallel Programming for Scientific Computing" 2014 (PhD
course)
- "Parallel Programming" ST 2013, ST 2014
- "High Performance Computing II: Algorithms and Applications"
(NGSSC course) 2012
- Co-supervision of 4 Diploma (master thesis) students, 1
bachelor student (KIT, Germany)
- Co-supervision of seminar “Parallel Computing on GPU” WT
2011/2012 (KIT, Germany)
- Lecture assistant and co-developer of lecture slides/notes
(KIT, Germany)
- “Parallel Computing” WT 2011/2012
- “Numerical Simulation in the Many-core Era” ST 2010
- Lecture assistant for “Exercises with Maple”, a course for 1st and
2nd semester engineering and mathematics students, WT
2005/2006, ST 2006 (TU-Sofia, Bulgaria)
Invited Talks and Visits
- Bulgarian Academy of Sciences (Institute of Information
and Communication Technologies), Sofia, Bulgaria: A Library for
Iterative Sparse Methods on CPU and GPU, June, 2014
- TU Delft, Netherlands: Fine-grained parallel iterative solvers
in OpenFOAM - utilizing multi-core and GPU devices with
PARALUTION, Sept, 2013
- Swiss National Supercomputing Centre (CSCS), Switzerland:
PARALUTION - Tutorial and hands-on, CSCS-FoMICS-USI Autumn
School, Sept 2013
- Uppsala University, Sweden: Sparse linear algebra on
heterogeneous platforms - state of art and trends,
"Heterogeneous computing - impact on algorithms" ComplexHPC
Spring School, June, 2013
- Università della Svizzera italiana, Switzerland: Iterative
Preconditioned Solvers on Multi- and Many-core Platforms, July
2012
- International Workshop on Efficient Solvers in Biomedical
Applications, Mariatrost, Austria: Fine-grained Parallel Solvers
and Preconditioners on Multi-core CPUs and GPUs - The Next Steps
in the Biomedical Engineering, July, 2012 (presented by Vincent
Heuveline)
- Accelerated HPC Symposium (organized by Los Alamos National
Laboratory), San Jose, USA: Iterative Solvers for Many-core
Chips - Fault Tolerant Solvers, GTC Conference, May, 2012
- NVIDIA, Santa Clara, USA: Parallel Sparse Solvers and
Preconditioners for Multi- and Many-core Platforms, April 2012
- Uppsala University, Sweden: Preconditioners for multi- and
many-core platforms (GPU), Division of Scientific Computing, IT,
Feb, 2012
- Jülich Supercomputing Center (Forschungszentrum Jülich),
Germany: Parallel preconditioners for multi-core CPU and GPU
platforms, June, 2011
Book Chapters
- D. Lukarski, M. Neytcheva: On the impact of the heterogeneous
multi- and many-core platforms on iterative solution methods and
preconditioning techniques, Book Chapter in “High-Performance
Computing on Complex Environments”, Editors: E. Jeannot, J.
Zilinskas, Publisher: John Wiley & Sons, Inc., ISBN
978-1-118-71205-4, 2014
Journal Articles
- D. Hoske, D. Lukarski, H. Meyerhenke, M. Wegner: Engineering a
Combinatorial Laplacian Solver: Lessons Learned, Algorithms,
Volume 9, Issue 4, page 72, Multidisciplinary
Digital Publishing Institute, 2016
- S. Engblom, D. Lukarski: Fast Matlab compatible sparse
assembly on multicore computers, Journal of Parallel Computing -
Systems & Applications, Vol 56, pages 1-17, DOI:
10.1016/j.parco.2016.04.001, 2016
- R. Gupta, D. Lukarski, M. B. van Gijzen, C. Vuik: Evaluation
of the deflated preconditioned CG method to solve bubbly and
porous media flow problems on GPU and CPU,
International Journal for Numerical Methods in Fluids, vol. 80,
issue 11, pp. 666-683, DOI: 10.1002/fld.4170, 2015
- N. Trost, J. Jiménez, D. Lukarski, V. Sanchez: Accelerating
COBAYA3 on multi-core CPU and GPU systems using PARALUTION,
Annals of Nuclear Energy, Elsevier, DOI:
10.1016/j.anucene.2014.08.005, 2014
- V. Heuveline, D. Lukarski, J.-P. Weiss: Performance of a
Stream Processing Model on the Cell BE NUMA Architecture Applied
to a 3D Conjugate Gradient Poisson Solver, Int. J. Computational
Science, 3(5), pp. 473-490, 2009
- D. Marinova, D. Lukarski, G. Stavroulakis: Modeling and
Optimal Control Design for Plates with Defects, Journal of
Vibration and Control, Vol. 13, No. 9-10, 1343-1353 (2007), DOI:
10.1177/1077546307077501, 2007
Conference and Workshop Proceedings
- D. Weller, F. Oboril, D. Lukarski, J. Becker, M. Tahoori: Energy
efficient scientific computing on FPGAs using OpenCL,
Proceedings of the 2017 ACM/SIGDA International Symposium on
Field-Programmable Gate Arrays, Pages 247-256,
ACM, 2017
- E. Bergamini, M. Wegner, D. Lukarski, H. Meyerhenke:
Estimating current-flow closeness centrality with a multigrid
Laplacian solver. In Proc. of the Seventh SIAM Workshop on
Combinatorial Scientific Computing 2016, Society for Industrial
and Applied Mathematics, 2016
- D. Hoske, D. Lukarski, H. Meyerhenke, M. Wegner: Is
Nearly-linear the same in Theory and Practice? A Case Study with
a Combinatorial Laplacian Solver. In Proc. 14th Intl. Symp. on
Experimental Algorithms (SEA 2015), pp. 205-218. LNCS 9125,
Springer-Verlag, 2015
- A. Dorostkar, D. Lukarski, B. Lund, M. Neytcheva, Y. Notay, P.
Schmidt: CPU and GPU performance of large scale numerical
simulations in Geophysics, Euro-Par 2014
- D. Lukarski, H. Anzt, S. Tomov, J. Dongarra: Hybrid
Multi-Elimination ILU Preconditioners on GPUs, International
Heterogeneity in Computing Workshop (HCW), Phoenix, 2014
- A. Dorostkar, D. Lukarski, B. Lund, M. Neytcheva, Y. Notay, P.
Schmidt: Performance study of block-preconditioned iterative
methods on multicore computer systems and GPU, Parallel Matrix
Algorithms and Applications (PMAA14), Lugano, 2014
- H. Anzt, D. Lukarski, S. Tomov, J. Dongarra: Self-Adaptive
Multiprecision Preconditioners on Multicore and Manycore
Architectures, VECPAR 2014, Eugene, OR, 2014
- S. Suwelack, D. Lukarski, V. Heuveline, R. Dillmann, S.
Speidel: Accurate Surface Embedding for Higher Order Finite
Elements, Proceedings of the 12th ACM SIGGRAPH/Eurographics
Symposium on Computer Animation 2013
- V. Heuveline, D. Lukarski, N. Trost, J.-P. Weiss: Parallel
Smoothers for Matrix-based Geometric Multigrid Methods on
Locally Refined Meshes Using Multicore CPUs and GPUs, Facing the
Multicore-Challenge II, Springer, LNCS 7174, p.158, 2012
- V. Heuveline, D. Lukarski, C. Subramanian, J.-P. Weiss:
Parallel Preconditioning and Modular Finite Element Solvers on
Hybrid CPU-GPU Systems, in P. Iványi, B.H.V. Topping (Eds.),
Proc. of the Sec. Int. Conf. on Parallel, Distributed, Grid and
Cloud Computing for Engineering (ParEng 2011), Civil-Comp Press,
Stirlingshire, UK, Paper 36, 2011
- V. Heuveline, D. Lukarski, F. Oboril, M. B. Tahoori, J.-P.
Weiss: Numerical Defect Correction as an Algorithm-Based Fault
Tolerance Technique for Iterative Solvers, 17th IEEE Pacific Rim
Int. Symp. on Dependable Computing (PRDC), 2011
- H. Anzt, W. Augustin, M. Baumann, T. Gengenbach, T. Hahn, A.
Helfrich-Schkarbanenko, V. Heuveline, E. Ketelaer, D. Lukarski,
A. Nestler, S. Ritterbusch, S. Ronnås, M. Schick, M.
Schmidtobreick, C. Subramanian, J.-P. Weiss, F. Wilhelm, M.
Wlotzka: HiFlow3 - A Hardware-Aware Parallel Finite
Element Package, Proceedings, 5th Parallel Tools Workshop,
Dresden, Springer, 2011
- V. Heuveline, C. Subramanian, D. Lukarski, J.-P. Weiss: A
Multi-platform Linear Algebra Toolbox for Finite Element Solvers
on Heterogeneous Clusters, Workshop on Parallel Programming and
Applications on Accelerator Clusters (PPAAC), IEEE Cluster,
Crete, 2011
- V. Heuveline, D. Lukarski, J.-P. Weiss: Scalable
Multi-Coloring Preconditioning for Multicore CPUs and GPUs,
UCHPC’10 Workshop, Euro-Par 2010 Parallel Processing Workshops,
Springer LNCS Vol. 6586, pp. 389-397, 2010
- V. Heuveline, D. Lukarski, J.-P. Weiss: RapidMind Stream
Processing on the PlayStation 3 for a Chorin-based Navier-Stokes
Solver. In Proc. 1st Int. Workshop on New Frontiers in
High-performance and Hardware-aware Computing (HipHaC 08), pp.
31 - 38, Lake Como, Italy, 2008, Universitaetsverlag Karlsruhe
- D. Marinova, D. Lukarski, G. Stavroulakis: Modeling and
Optimal Control Design for Plates with Defects, Proc. MME06
(Mathematical Methods in Engineering), Ankara, Turkey, 2006
- D. G. Marinova, G.E. Stavroulakis, E.C. Zacharenakis, D.H.
Lukarski: Active optimal control of damaged smart plates in
bending, Proc. 6th ESMC (6th European Solid Mechanics
Conference), Budapest, Hungary, 2006
- G.E. Stavroulakis, D.G. Marinova, D.H. Lukarski, E.C.
Zacharenakis: Nondestructive Identification of defects for smart
plates in bending using genetic algorithms, Proc. 3rd ECCM2006,
Lisbon, Portugal, 5-8 June 2006. (ECCM2006 - III European
Conference on Computational Mechanics Solids, Structures and
Coupled Problems in Engineering), Page 165, DOI
10.1007/1-4020-5370-3, ISBN: 978-1-4020-4994-1,
Springer, 2008
- D. Marinova, G.E. Stavroulakis, D.H. Lukarski: Active control
of plates in bending. Modeling and influence of damage, Proc.
Vol. AMEE’05 (Applications of Mathematics in Engineering and
Economics), Sozopol, Bulgaria, 2005
Pre–prints / Technical Reports
- D. Hoske, D. Lukarski, M. Wegner, H. Meyerhenke: What
'Provably Fast' Can Mean in Practice: A Case Study with a
Combinatorial Laplacian Solver. Extended version of SEA'15
paper.
- S. Engblom, D. Lukarski: Fast Matlab compatible sparse
assembly on multicore computers, Preprint, 2014
- A. Dorostkar, D. Lukarski, B. Lund, M. Neytcheva, Y. Notay, P.
Schmidt: Parallel Performance Study of Block-Preconditioned
Iterative Methods on Multicore Computer Systems, Technical
report 2014-007, University of Uppsala, 2014
- D. Lukarski, T. Skoglund: A Priori Power Estimation of Linear
Solvers on Multi-Core Processors, Technical Report 2013-020, IT
Uppsala University, 2013
- V. Heuveline, D. Lukarski, N. Trost, J.-P. Weiss: Parallel
Smoothers for Matrix-based Multigrid Methods on Unstructured
Meshes Using Multicore CPUs and GPUs, EMCL Preprint 2011-09,
2011
- V. Heuveline, D. Lukarski, J.-P. Weiss: Enhanced Parallel
ILU(p)-based Preconditioners for Multi-core CPUs and GPUs - The
Power(q)-pattern Method, EMCL Preprint 2011-08, 2011
- H. Anzt, W. Augustin, M. Baumann, H. Bockelmann, T.
Gengenbach, T. Hahn, V. Heuveline, E. Ketelaer, D. Lukarski, A.
Otzen, S. Ritterbusch, B. Rocker, S. Ronnas, M. Schick, C.
Subramanian, J.-P. Weiss, F. Wilhelm: HiFlow3 - A Flexible and
Hardware-Aware Parallel Finite Element Package, EMCL Preprint
2010-06, 2010
Conference and Workshop Talks
- D. Lukarski: "PARALUTION - Library for Iterative Sparse
Methods", GPU Technology Conference (GTC), San Jose, 2014
- D. Lukarski: "PARALUTION - Library for Iterative Sparse
Methods on Multi-core CPU and GPU Devices", NVIDIA
GPU Technology Theater, Denver, SC13, 2013
- N. Trost, J. Jiménez, D. Lukarski, V. Sanchez: Accelerating
COBAYA3 on multi-core CPU and GPU systems using PARALUTION,
SNA+MC 2013 International Conference, 2013
- J.-P. Weiss, D. Lukarski, V. Heuveline: Parallel
preconditioners and multigrid methods for sparse systems on
GPUs, SIAM Conference on Applied Linear Algebra, Valencia,
June 2012
- V. Heuveline, R. Lohner, D. Lukarski, J.-P. Weiss: Advances in
Hardware-aware & Heterogeneous Computing, Birds-of-a-Feather
Session, International Supercomputing Conference, Hamburg, June
2012
- D. Lukarski, J.-P. Weiss: The Implications of GPUs for
Parallel Numerical Simulation, Minisymposium High performance
linear Algebra on GPUs, Gesellschaft für Angewandte Mathematik
und Mechanik (GAMM) Conference, Darmstadt, Germany, March 26,
2012
- D. Lukarski, J.-P. Weiss: Fine-Grained Parallel
Preconditioners for Fast GPU-based Solvers, GPU Technology
Conference (GTC), San Jose, May, 2012
- D. Lukarski, J.-P. Weiss: LAtoolbox: A Multi-platform
Sparse Linear Algebra Toolbox, GPU Technology Conference (GTC),
San Jose, May, 2012
- H. Anzt, W. Augustin, M. Baumann, T. Gengenbach, T. Hahn, A.
Helfrich-Schkarbanenko, V. Heuveline, E. Ketelaer, D. Lukarski,
A. Nestler, S. Ritterbusch, S. Ronnas, M. Schick, M.
Schmidtobreick, C. Subramanian, J.P. Weiss, F. Wilhelm, M.
Wlotzka: HiFlow3 – A Multi-Purpose and Flexible Parallel Finite
Element Package, Open Source CFD International Conference,
Paris-Chantilly, France, 2011
- V. Heuveline, D. Lukarski, J.-P. Weiss: Fine-grained Parallel
ILU Preconditioners with Fill-ins for Multi-core CPUs and GPUs,
International Conference On Preconditioning Techniques For
Scientific And Industrial Applications (Abstract Proceedings),
Bordeaux, France, May, 2011
Committee Member
- UnConventional High Performance Computing Workshop (UCHPC),
Grenoble, France, 2016, Organizing Committee and Program
Committee
- Management Committee of the EU COST Action IC1305 Network for
Sustainable Ultrascale Computing (NESUS), 2014, representative
for Sweden
- UnConventional High Performance Computing Workshop (UCHPC),
Portugal, 2014, Organizing Committee
- Enhancing Parallel Scientific Applications with Accelerated
HPC Workshop (ESAA), Japan, 2014, Technical Program Committee
- Techniques and Applications for Sustainable Ultrascale
Computing Systems Workshop (TASUS), Portugal, 2014, Technical
Program Committee
- ComplexHPC Spring School 2013, "Heterogeneous computing -
impact on algorithms", June 3-7, 2013, Uppsala University,
Uppsala, Sweden, Program Committee
- Facing the Multicore-Challenge Conference, Germany, 2012,
Technical Program Committee
- Facing the Multicore-Challenge Conference, Germany, 2011,
Technical Program Committee
Last update: March 2018