Common tasks

If cocoa is the cookbook, this is the spice rack. Each task below is one question you ask a codebase, answered with the verified CLDK call and the result it returns: input above, output as a trailing comment. Every example runs against Apache Commons CLI (project_path="commons-cli"), the recurring sample across these docs, with a generic my_pkg for Python.

Enumerate every method on a class: the first thing you do when you land in unfamiliar code.

from cldk import CLDK
analysis = CLDK(language="java").analysis(project_path="commons-cli")
klass = "org.apache.commons.cli.Options"
methods = analysis.get_methods()[klass]
for signature in methods:
    print(signature)
# addOption(Option)
# addOption(String, boolean, String)
# getOption(String)
# hasOption(String)
# ...
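Overloads are easy to miss in a long listing. The following is a minimal sketch of grouping signatures by bare method name, assuming the signatures are plain strings shaped like the output above. The `signatures` list is a hand-written stand-in for the keys of `analysis.get_methods()[klass]`, and `group_overloads` is a hypothetical helper, not part of CLDK.

```python
from collections import defaultdict

# Stand-in for the signature strings printed above (not real CLDK output).
signatures = [
    "addOption(Option)",
    "addOption(String, boolean, String)",
    "getOption(String)",
    "hasOption(String)",
]

def group_overloads(sigs):
    """Map each bare method name to the list of its overloaded signatures."""
    groups = defaultdict(list)
    for sig in sigs:
        name = sig.split("(", 1)[0]  # text before the first "(" is the name
        groups[name].append(sig)
    return dict(groups)

overloads = group_overloads(signatures)
for name, sigs in overloads.items():
    if len(sigs) > 1:
        print(name, "has", len(sigs), "overloads")
# addOption has 2 overloads
```

Against a real project, passing the keys of the per-class mapping in place of `signatures` should give the same grouping, provided the keys are signature strings in your CLDK version.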

Pull the exact source of one method by its qualified class name and signature: no file reading, no line-range guessing.

method = analysis.get_method(
    "org.apache.commons.cli.Options",
    "addOption(Option)",
)
print(method.code)
# public Options addOption(Option opt) {
#     String key = opt.getKey();
#     ...
#     return this;
# }

The call graph is a networkx.DiGraph with edges pointing caller → callee. Callers, callees, and reachability are all computed from it, and building it requires the call_graph analysis level.

from cldk import CLDK
from cldk.analysis import AnalysisLevel
analysis = CLDK(language="java").analysis(
    project_path="commons-cli",
    analysis_level=AnalysisLevel.call_graph,  # required for call edges
)
cg = analysis.get_call_graph()
print(cg.number_of_nodes(), "methods,", cg.number_of_edges(), "edges")
# 421 methods, 638 edges

Impact analysis: “if I change this, what breaks?” get_callers returns the set of methods that invoke the target.

callers = analysis.get_callers(
    "org.apache.commons.cli.Options",
    "addOption(Option)",
)
print(list(callers))
# ['org.apache.commons.cli.Options.addOption(String, boolean, String)',
#  'org.apache.commons.cli.DefaultParser.handleOption(...)', ...]
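Direct callers answer only the first hop of "what breaks?". The sketch below extends the idea to the full impact set by walking caller → callee edges backwards. It runs on a toy edge list with made-up method names, and `transitive_callers` is a hypothetical helper, not a CLDK call.

```python
from collections import deque

# Toy caller -> callee edges; the method names are illustrative only.
edges = [
    ("A.main()", "B.parse()"),
    ("B.parse()", "C.handle()"),
    ("C.handle()", "D.add()"),
    ("E.helper()", "D.add()"),
]

def transitive_callers(edges, target):
    """Return every method that reaches `target` through one or more calls."""
    # Invert the edges: callee -> set of its direct callers.
    callers_of = {}
    for caller, callee in edges:
        callers_of.setdefault(callee, set()).add(caller)

    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers_of.get(current, ()):
            if caller not in seen:  # the seen-set also guards against cycles
                seen.add(caller)
                queue.append(caller)
    return seen

impacted = transitive_callers(edges, "D.add()")
print(sorted(impacted))
# ['A.main()', 'B.parse()', 'C.handle()', 'E.helper()']
```

On a real call graph, `networkx.ancestors(cg, node)` computes the same set once you have located the node for the target method.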

The dependency view: “what does this method reach out to?” get_callees is the mirror of get_callers.

callees = analysis.get_callees(
    "org.apache.commons.cli.DefaultParser",
    "parse(Options, String[])",
)
print(list(callees))
# ['org.apache.commons.cli.Options.getOption(String)',
#  'org.apache.commons.cli.CommandLine.addOption(Option)', ...]

In Java, walk class relationships with get_sub_classes, get_extended_classes, and get_implemented_interfaces.

parser = "org.apache.commons.cli.Parser"
print(analysis.get_sub_classes(parser))
# {'org.apache.commons.cli.BasicParser', 'org.apache.commons.cli.GnuParser', ...}
print(analysis.get_implemented_interfaces("org.apache.commons.cli.DefaultParser"))
# {'org.apache.commons.cli.CommandLineParser'}
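If get_sub_classes reports only direct subclasses in your version (an assumption worth checking against your own project), the whole subtree can be collected with a short traversal. This sketch uses a toy parent → children mapping with made-up class names; `all_descendants` is a hypothetical helper, not part of CLDK.

```python
# Toy mapping of class -> direct subclasses; the names are illustrative only.
direct_subclasses = {
    "Base": {"Mid1", "Mid2"},
    "Mid1": {"Leaf"},
    "Mid2": set(),
    "Leaf": set(),
}

def all_descendants(direct, root):
    """Collect every class below `root`, however many levels down."""
    found = set()
    stack = [root]
    while stack:
        for child in direct.get(stack.pop(), ()):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found

descendants = all_descendants(direct_subclasses, "Base")
print(sorted(descendants))
# ['Leaf', 'Mid1', 'Mid2']
```

With CLDK, the lookup `direct.get(...)` would be replaced by a call to `analysis.get_sub_classes(...)` for each class popped from the stack.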

Locate where the code reads or writes persistent data: useful for security triage and data-flow questions. This is a Java-only capability.

crud = analysis.get_all_crud_operations()
for entry in crud:
    print(entry)
# JCRUDOperation(operation_type='READ', ...)
# JCRUDOperation(operation_type='CREATE', ...)
# ...

There is no is_reachable() method: reachability is a networkx query over the call graph CLDK hands you. Node identity is backend/version dependent, so locate endpoints by scanning node metadata once, then ask networkx for a path.

import networkx as nx
from cldk import CLDK
from cldk.analysis import AnalysisLevel
analysis = CLDK(language="java").analysis(
    project_path="commons-cli",
    analysis_level=AnalysisLevel.call_graph,
)
cg = analysis.get_call_graph()
# Inspect one node ONCE to learn the metadata schema for your version:
# print(list(cg.nodes(data=True))[0])
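def find_node(class_name, method_decl):
    # Substring match over the node's metadata; tighten once you know the schema.
    for node, data in cg.nodes(data=True):
        if class_name in str(data) and method_decl in str(data):
            return node
    return None

src = find_node("org.apache.commons.cli.CLI", "main(String[])")
sink = find_node("org.apache.commons.cli.CommandLine", "execute(String)")
# nx.has_path raises NodeNotFound for a missing node, so fail early with a clearer message.
if src is None or sink is None:
    raise LookupError("endpoint not found in the call graph; check the node metadata schema")
print(nx.has_path(cg, src, sink))
# False -> the sink is not reachable from this entry point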