An Introduction to Generative AI and Development Tools
A brief introduction to modern development tools. Integrated development environments. Language server protocol. Generative code assistant tools.
1 Integrated Development Environment
- An Integrated Development Environment (IDE) is a software application providing tools to facilitate programming activities.
1.1 Overview
- There are many IDEs with varying feature sets.

- IDEs can specialize in a particular programming language or support multiple languages.
- Although the provided features vary across IDEs and languages, at a minimum, modern IDEs provide tools for editing, executing, and debugging code.
1.2 More than a text editor
A long time ago in a galaxy far, far away…
- Code editing was done in a text editor, generating source code files.
- Source code was translated by a separate program, the compiler, which produced object files containing machine code.
- Object files were combined with one another and with pre-existing libraries by another program, the linker, which produced an executable program.
- If the program did not behave as expected, it was run under yet another program, the debugger, to identify potential issues.
- The (development) process of editing, translating, executing, and debugging was repeated until the program was working as expected.
- During these iterations, one had to switch between different environments and tools many times.
- The idea of IDEs is to integrate all needed tools in a single environment to increase productivity.

1.3 Modern IDEs
- Modern IDEs extend the set of provided features beyond the basic development cycle:
- Code navigation.
- Code completion.
- Code refactoring and renaming.
- Code formatting.
- And many more…
- So many that programming with an IDE is an entirely different experience than programming in the same language without one or with a different one.
- The efficiency of working with an IDE led to a high number of different IDEs.
- IDE providers strove to include as many modern features as possible so as not to fall behind the competition.
- Further, as more programming languages and frameworks were developed, the number of times the same feature had to be implemented grew multiplicatively.
- For \(L\) languages and \(I\) IDEs, every feature had to be implemented \(L \times I\) times.

- Besides work duplication, not all feature implementations were identical across languages and IDEs.
- So what if one learns to program with an IDE and then works in a company that uses a different one?
2 Language Server Protocol
A new hope
2.1 What is the LSP?
- The Language Server Protocol (LSP) is a specification of communication rules between IDEs and language servers.
- Originally developed by Microsoft.
- In 2016, Microsoft partnered with Red Hat and Codenvy to develop an open standard for the LSP.
- Today, the LSP is widely adopted across IDEs and programming languages.
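To make "communication rules" concrete, here is a minimal sketch of how a single LSP message might be framed. The LSP exchanges JSON-RPC 2.0 payloads preceded by a `Content-Length` header. The helper names, the file URI, and the cursor position are assumptions made up for illustration, and the parser handles only the single-header case.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Encode a JSON-RPC payload and prepend the Content-Length header."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def parse_lsp_message(raw: bytes) -> dict:
    """Decode one framed message (assumes a single Content-Length header)."""
    header, _, body = raw.partition(b"\r\n\r\n")
    length = int(header.decode("ascii").split(":", 1)[1])
    return json.loads(body[:length].decode("utf-8"))


# A completion request: the client asks what could be inserted at the cursor.
# The URI and position below are hypothetical values.
completion_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/completion",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.R"},
        "position": {"line": 3, "character": 7},
    },
}

framed = frame_lsp_message(completion_request)
decoded = parse_lsp_message(framed)
print(framed[:40])
print(decoded["method"])
```

Any client that can produce such a message and any server that can answer it can work together, regardless of the editor or the language involved.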
2.2 Why was it so successful?
- By specifying the communication rules, the LSP reduces the \(L \times I\) to an \(L + I\) implementation problem.
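The difference between the two counts is easy to underestimate. The sketch below simply evaluates both expressions for a few made-up ecosystem sizes; the numbers are illustrative assumptions, not measurements.

```python
def implementations_without_lsp(languages: int, ides: int) -> int:
    """Every IDE implements every feature separately for every language."""
    return languages * ides


def implementations_with_lsp(languages: int, ides: int) -> int:
    """One language server per language plus one client per IDE."""
    return languages + ides


# Hypothetical ecosystem sizes, chosen only to show the trend.
for languages, ides in [(5, 5), (20, 10), (50, 20)]:
    without = implementations_without_lsp(languages, ides)
    with_lsp = implementations_with_lsp(languages, ides)
    print(f"L={languages}, I={ides}: L*I={without}, L+I={with_lsp}")
```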

- Programming languages implement language servers that understand the LSP.
- IDEs implement clients that understand the LSP.
- No need for feature implementation duplication.

2.3 Why should I care?
- Added benefit: More consistent feature implementations across IDEs.
- Learning to program with an IDE that supports the LSP makes it easier to switch to another IDE that also supports the LSP.
2.4 LSP in R?
- The `languageserver` package provides an LSP implementation for the R programming language.
- It supports:
- Code completion.
- Code navigation.
- Code formatting (via the `styler` package).
- Code refactoring.
- Code linting (via the `lintr` package).
3 AI coding assistants
The force awakens
3.1 Large Language Models
- Large language models are machine learning models for natural language processing.
- They receive as input a sequence of tokens (words or word fragments) and output a sequence of tokens.
- Their output is one of the most likely sequences of tokens given the input.
- Since programs are sequences of keywords and symbols, one of the most successful applications of large language models is code generation.
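As a rough illustration of "the most likely sequence given the input", the sketch below counts which token follows which in a tiny, made-up corpus of code tokens and then predicts the most frequent successor. Real large language models are neural networks conditioned on long contexts, so this bigram counter is only a toy stand-in for the idea; the function names and the corpus are assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_model(tokens):
    """Count, for every token, how often each other token follows it."""
    follow_counts = defaultdict(Counter)
    for current, following in zip(tokens, tokens[1:]):
        follow_counts[current][following] += 1
    return follow_counts


def most_likely_next(model, token):
    """Return the most frequent successor of a token, or None if unseen."""
    candidates = model.get(token)
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# A tiny hypothetical training corpus, already split into tokens.
corpus = (
    "for i in range ( n ) : print ( i ) "
    "for j in range ( m ) : print ( j )"
).split()

model = train_bigram_model(corpus)
print(most_likely_next(model, "in"))
print(most_likely_next(model, "range"))
print(most_likely_next(model, "while"))
```

The model has no notion of what the loop is supposed to do. It only knows that `range` tended to follow `in` in the text it has seen.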
- Companies have an enormous commercial interest in AI that can generate safe and efficient code at a lower cost than humans.
- However, whether this is indeed feasible is not straightforward to answer.
- Unlike humans, generative AI models do not generate code based on requirements but based on the statistical correlation of what is more likely to follow.
- Asking a generative AI model to generate code for the same task multiple times typically results in different outputs, because the output is sampled from a probability distribution.
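One way to see why repeated queries can differ: a sketch that samples a continuation from a fixed probability table and compares it with always taking the single most likely option. The candidate continuations and their probabilities are invented for illustration, and real assistants sample token by token rather than from three whole snippets.

```python
import random

# Hypothetical probabilities a model might assign to three continuations.
next_token_probabilities = {
    "sum(values)": 0.5,
    "total": 0.3,
    "np.sum(values)": 0.2,
}


def sample_completion(probabilities, rng):
    """Draw one continuation at random, weighted by its probability."""
    tokens = list(probabilities)
    weights = list(probabilities.values())
    return rng.choices(tokens, weights=weights, k=1)[0]


def greedy_completion(probabilities):
    """Always pick the single most likely continuation."""
    return max(probabilities, key=probabilities.get)


# Each seed stands in for one independent query to the model.
samples = [
    sample_completion(next_token_probabilities, random.Random(seed))
    for seed in range(20)
]

print(sorted(set(samples)))
print(greedy_completion(next_token_probabilities))
```

Greedy selection is repeatable but always returns the same answer, while sampling gives varied answers at the cost of reproducibility.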
Reproducibility


- Every time we query the model, it generates a solution from scratch.
- Humans generating code for the same task a second time are more likely to work on enhancing the existing code instead of starting from scratch.
- This is, perhaps, not a big issue for small programming tasks.
- But what if your task involves thousands of lines of code distributed across multiple files?
- Is it feasible to examine and deal with the complexity of the generated code every time from scratch?
- This raises doubts about the long-term maintainability of AI-generated code.
3.2 Hallucinations
- Another issue with generative AI models is hallucinations.
- Hallucinations in code generation manifest in a few ways.
- When working with self-developed or niche libraries, the model may not have seen enough examples to generate correct code.
- It is likely to generate code that is syntactically correct but calls functions and imports modules that do not exist.
- When you have a logical error in your code, the model may generate code that is syntactically correct but reinforces or replicates the logical error.
- This is because the model does not generate code based on requirements but based on what is more likely to follow what you have already written.
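Calls to functions and modules that do not exist can be caught mechanically before the code is ever run in earnest. The sketch below is one possible check, using Python's standard `importlib`; the helper name `call_exists` and the list of suggested calls are assumptions for illustration. Note that it cannot detect the second kind of hallucination, where the code runs but implements the wrong logic.

```python
import importlib
import importlib.util


def call_exists(module_name: str, attribute: str) -> bool:
    """Return True only if the module can be found and exposes the attribute."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised, for example, when the parent package of a dotted name is missing.
        return False
    if spec is None:
        return False
    # Importing runs the module's top-level code, so use this on trusted modules.
    module = importlib.import_module(module_name)
    return hasattr(module, attribute)


# Names an assistant might emit. The second and third are deliberately fake.
suggested_calls = [
    ("json", "dumps"),
    ("json", "to_yaml"),
    ("not_a_real_module_xyz", "run"),
]

for module_name, attribute in suggested_calls:
    status = "found" if call_exists(module_name, attribute) else "missing"
    print(f"{module_name}.{attribute}: {status}")
```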
3.3 Context awareness
- Another issue commonly encountered with coding assistants is the lack of context awareness.
- Many implementations can correctly solve a programming task.
- In addition, all of them can be equally efficient, safe, and maintainable.
- However, not all implementations are equally appropriate for all contexts.
3.4 Context awareness
- Some implementations may fit better when paired with other parts of the code and the overarching goals of the project.
- Nonetheless, coding assistants do not always have information about the project’s goals or the rest of the codebase.
- Ultimately, evaluating the appropriateness of a generated solution remains a human task.
3.5 Working with AI coding assistants
- Consequently, AI coding assistants fundamentally change the way we program.
- Researching a solution:
- Without: More tedious and time-consuming. Reading documentation, searching for existing solutions, implementations, and libraries.
- With: Automatically generated.
- Editing code:
- Without: More manual and slow. Omissions, logical errors, and typos can creep in.
- With: More automated. Omissions, logical errors, and typos can still creep in.
- Reviewing and debugging code:
- Without: Easier to review self-written code because the logic is known.
- With: Harder to review. Need to understand the logic. Need to understand how used functions and modules work (documentation).
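One practical way to approach the review step is to write down the requirements as input and expected-output pairs and run the generated code against them. The sketch below assumes a hypothetical assistant-generated `generated_median` function that looks plausible but mishandles even-length inputs; the function, the helper, and the cases are all invented for illustration.

```python
def generated_median(values):
    """Pretend this body came from a coding assistant (hypothetical)."""
    ordered = sorted(values)
    # Plausible-looking, but wrong for an even number of values.
    return ordered[len(ordered) // 2]


def review_against_requirements(candidate, cases):
    """Run the candidate on each case and collect the mismatches."""
    failures = []
    for inputs, expected in cases:
        actual = candidate(inputs)
        if actual != expected:
            failures.append((inputs, expected, actual))
    return failures


# Requirements written by the reviewer, independent of the generated code.
requirement_cases = [
    ([3, 1, 2], 2),
    ([4, 1, 3, 2], 2.5),
    ([7], 7),
]

failures = review_against_requirements(generated_median, requirement_cases)
for inputs, expected, actual in failures:
    print(f"median({inputs}) returned {actual}, expected {expected}")
```

The cases only help if they come from the requirements rather than from reading the generated code, since otherwise they tend to confirm whatever the code already does.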
3.6 No free lunch
- Overall, working with AI coding assistants removes responsibilities from the research stage but creates new ones in the reviewing stage.