Looking for a Data Engineer with Healthcare (EHR & HL7) experience - Minneapolis, MN (Remote)
Email: [email protected]
From: Pavan Kumar, Yochana IT [email protected]
Reply to: [email protected]

Mandatory Skills: Healthcare Domain (EHR & HL7), Elasticsearch, Python, Apache Spark.

Title: Data Engineer to Build Advanced Search Capability Using the Elasticsearch Stack

Applicant MUST have the following skills:
Elasticsearch stack knowledge (demonstrable experience building search capability tooling using Elasticsearch).
Python programming knowledge and experience.
Apache Spark knowledge and experience, in particular with the PySpark API.
Data pipeline experience.
Excellent communication skills.

What project(s) will the candidate be working on while on assignment?
This data engineer will primarily be building/architecting and upgrading/augmenting a search tool that uses the Elasticsearch stack to search a large volume of text documents. This person will be tasked with building a production-level search application. We need to build and keep up to date the Elasticsearch indices that allow users outside our group to search the notes.

Primary Responsibilities:
Design and develop a production-level search application for easy searching of content within a big data medical free-text data asset.
Work with EHR data across teams with ETL and NLP engineers, data scientists, researchers, and clinicians to provide search services with a high standard of data quality control.

Team Description:
You would be part of a small core NLP team with 15 core team members (data scientists, project manager, medical informaticists, data analysts), with support from 12 clinical annotators integrated into the team via a third-party vendor.

What are the top 5-10 responsibilities for this position?
Demonstrable senior-level proficiency in and knowledge of the Elasticsearch stack.
Programming experience, including solid Python experience, following software engineering best practices.
Experience building and maintaining data pipelines and data assets.
Experience building dashboards and user interfaces using Kibana or other visualization tools.
Experience with distributed data processing frameworks such as Spark or MapReduce.
Experience as an individual contributor (hands-on developer, non-manager role) executing engineering projects as a primary job responsibility.
Demonstrated knowledge of data management best practices.

Main Technologies:
Currently the main technologies we use are Apache Spark, Hadoop, Hive, Luigi, and Python (plus a little Scala), and our platform is an on-prem Hadoop cluster. Candidates should be solid with at least some of these technologies and follow good engineering practices, such as testing, code reviews, and putting in place monitoring systems like dashboards or alerts.

Preferred Qualifications:
Experience with dashboard development in Elasticsearch.
Experience with data pipeline frameworks such as Airflow, Luigi, or Oozie.
Experience with cloud-based computing (AWS or Azure).
Familiarity with EHR data and standards (HL7 or FHIR).
Experience with non-relational databases.
Experience with code and process documentation.
Experience explaining, educating, presenting, and/or training non-engineers on engineering concepts and processes.
Experience with continuous integration and delivery.

Where is the work to be performed?
Remote work.

We are seeking a Data Engineer who is eager to tackle the challenges of processing vast amounts of EHR data originating from multiple sources. You will need to develop a deep understanding of the data and drive efforts to maintain and improve data quality and usability. You should understand the importance and value of writing maintainable, documented, and well-tested code throughout the entire product lifecycle. Above all, you should be curious about what is possible in healthcare with the right tools and infrastructure.

Keywords: information technology
Tue Sep 19 01:44:00 UTC 2023