ML/Backend Engineer (Python)

Salary not specified

Required work experience: 1–3 years

Full-time employment, full day

JetBrains is not merely a company that develops IDEs, but rather an ecosystem for everything related to software engineering. Keeping the company successful means always looking to the future. This can take many forms, but its two pillars are education and research.

Quality education is necessary to create the next generation of software developers. To facilitate this, there is Stepik (https://stepik.org/), a MOOC platform that provides a wide variety of courses in software engineering and mathematics, as well as physics, bioinformatics, and many other subjects. An alternative to course-based education is project-based education. For that, JetBrains Academy (https://hyperskill.org/) provides tracks that combine multiple projects, guiding you through the language you are studying and teaching you all the necessary concepts.

On the other hand, making the best products means being on the cutting edge of research. That’s why we have JetBrains Research (https://research.jetbrains.org/). In the Machine Learning Methods in Software Engineering group, we apply data-driven approaches to software engineering, research new ways to make our products smarter, investigate machine learning approaches, and build prototypes. Several of our projects are related to education, specifically to helping students learn to program.

So, our group has a strong interest and background in education-related projects. However, all the above-mentioned results are currently at the prototype stage. We are looking for a person who would help us integrate these prototypes into our educational products.

In addition to what we have already worked on, there are a lot of ideas to develop:

  • Recommending available courses. By analyzing the sequences of courses that previous users took, we can suggest personalized tracks to new users.

  • Estimating the distribution of time within a course. Students commonly ignore the concept of weeks in MOOCs. However, we can estimate which sections of a course will take longer than others by analyzing how previous users passed that course and how a given user passed previous courses.

  • Modeling the knowledge graph of a course, or of a whole domain involving multiple courses, and using it to make the predictions even better or to suggest to teachers ways to improve their courses.

  • Helping students write good code, not just correct code. It might be possible to estimate the complexity of a user’s solution even if it passes all tests, and to show them other users’ solutions so that they see that a given task can be solved in a simpler way. We could also implement refactoring recommendation techniques to eliminate code duplicates and other code smells.

  • Providing a user with more stats. It is nice to see that you passed a task that only 15% of users passed, but it would also be very interesting to see other comparisons: code quality, number of attempts, etc.

The candidate is expected to:

  • Know and be able to apply mathematical statistics;

  • Understand ML/DL;

  • Know how to work with conventional ML models;

  • Be proficient in Python;

  • Use NumPy, pandas, scikit-learn, SciPy;

  • Handle data visualization;

  • Have experience in industrial development with Python frameworks.

Additionally, it would be great to have experience with:

  • Django;

  • PyTorch or other DL frameworks (TensorFlow, Keras);

  • DL model integration;

  • Developing educational projects;

  • Studying online or teaching.

Currently, we are working on generating even more personalized hints for students. We have developed a set of tools for collecting and processing students' activity while they solve programming assignments. These tools include a plugin for several IntelliJ-based IDEs that captures snapshots of code and IDE interaction events as the code is written, thus allowing us to analyze the programming process. The plugin currently supports Python, Java, Kotlin, and C++. The general idea is the same here: use the data showing how an assignment was solved previously to help with future attempts. Here we know not only the resulting mistake but also the path leading to it. This way, when a new user makes a mistake or just gets stuck somewhere, we can find the most similar previous case and suggest a specific edit that would bring the user closer to the correct solution.

Key skills

Python

Vacancy published on February 20, 2021 in Saint Petersburg
